Mon. Not. R. Astron. Soc. 000, [01271(20 1 2) Printed 20 August 20 1 2 (MN KTeX style file v2.2) 



The ATLAS d project - XIX. Benchmark for early-type galaxies 
scaling relations from 260 dynamical models: mass-to-light ratio, 
dark matter, Fundamental Plane and Virial Plane 



Michele Cappellari, 1 * Nicholas Scott, 1 ' 2 Katherine Alatalo, 3 Leo Blitz, 3 Maxime Bois, 4 
Frederic Bournaud, 5 M. Bureau, 1 Alison F. Crocker, 6 Roger L. Davies, 1 
O ; Timothy A. Davis, 1 ' 7 P. T. de Zeeuw, 7 ' 8 Pierre- Alain Due, 5 

^ Sadegh Khochfar, 10 Davor Krajnovic, 7 Harald Kuntschner, 7 Richard M. McDermid, 11 
^ ; Raffaella Morganti, 12 ' 13 Thorsten Naab, 14 Tom Oosterloo, 1213 Marc Sarzi, 15 
Paolo Serra, 12 Anne-Marie Weijmans 16 and Lisa M. Young 17 

1 Sub-department of Astrophysics, Department of Physics, University of Oxford, Denys Wilkinson Building, Keble Road, Oxford OX1 3RH 

2 Centre for Astrophysics & Supercomputing, Swinburne University of Technology, PO Box 218, Hawthorn, VIC 3122, Australia 

3 Department of Astronomy, Campbell Hall, University of California, Berkeley, CA 94720, USA 

4 Observatoire de Paris, LERMA and CNRS, 61 Av. de I'Observatoire, F -7 5014 Paris, France 

5 Laboratoire AIM Paris-Saclay, CEA/IRFU/SAp CNRS Universite Paris Diderot, 91191 Gif-sur-Yvette Cedex, France 

6 Department of Astrophysics, University of Massachusetts, 710 North Pleasant Street, Amherst, MA 01003, USA 
(-H _ 7 European Southern Observatory, Karl-Schwarzschild-Str. 2, 85748 Garching, Germany 

. 8 Sterrewacht Leiden, Leiden University, Postbus 9513, 2300 RA Leiden, the Netherlands 
I ■ 9 Universite Lyon 1, Observatoire de Lyon, Centre de Recherche Astrophysique de Lyon 
^ ' and Ecole Normale Superieure de Lyon, 9 avenue Charles Andre, F-69230 Saint-Genis Laval, France 

| 10 Max-Planck Institutfur extraterrestrische Physik, PO Box 1312, D-85478 Garching, Germany 
r/^ , 11 Gemini Observatory, Northern Operations Centre, 670 N. A 'ohoku Place, Hilo, HI 96720, USA 

12 Netherlands Institute for Radio Astronomy (ASTRON), Postbus 2, 7990 AA Dwingeloo, The Netherlands 

13 Kapteyn Astronomical Institute, University of Groningen, Postbus 800, 9700 AV Groningen, The Netherlands 

14 Max-Planck Institut fiir Astrophysik, Karl-Schwarzschild-Str. 1, 85741 Garching, Germany 

15 Centre for Astrophysics Research, University of Hertfordshire, Hatfield, Herts AL1 9AB, UK 

16 Dunlap Institute for Astronomy & Astrophysics, University of Toronto, 50 St. George Street, Toronto, ON M5S 3H4, Canada 

17 Physics Department, New Mexico Institute of Mining and Technology, Socorro, NM 87801, USA 



O 

U 



> 

(N 
(N 

in 

cn 

od 
O 
(N 



Submitted to MNRAS on 16 August 2012 



© 2012 RAS 



2 M. Cappellari et al. 



ABSTRACT 

We study the volume-limited and nearly mass selected (stellar mass M s t a rs 6 x 10 9 M Q ) 

ATLAS 3D sample of 260 early-type galaxies (ETGs, ellipticals Es and lenticulars SOs). We 
construct detailed axisymmetric dynamical models, which allow for orbital anisotropy, in- 
clude a dark matter halo, and reproduce in detail both the galaxy images and the high-quality 
integral-field stellar kinematics out to about IR C , the projected half-light radius. We derive 
accurate total mass-to-light ratios (M/L) e and dark matter fractions /dm, within a sphere of 
radius r = R c centred on the galaxies. We infer masses Mjam = L x (M/L) e «2x M 1 / 2 , 
where M1/2 is the mass within a sphere enclosing half of the galaxy light. We also measure 
stellar (M/L) stars- We find that the thin two-dimensional subset spanned by galaxies in the 
(AfjAM, c e , i?J, nax ) coordinates system, which we call the Virial Plane (VP) has an observed 
rms scatter of 17%, which would imply an intrinsic one of just 4%. The VP satisfies the scalar 
virial relation A/jam = 5.0 x (jgi?" lax /G within our tight errors. This is true when (i) the 
major axis i?jf ax of the isophote containing half of the total projected galaxy light is used as 
galaxy radius, and (ii) a e is measured inside that isophote. This show that the larger scatter 
in the Fundamental Plane (FP) (L, er e , i?™ ax ) is due to stellar population effects (including 
trends in the stellar Initial Mass Function [IMF]) and confirms that the deviation from the 
virial relation is due to a genuine (M / L) e variation. However, the details of how both R c and 
a e are determined are critical in defining the precise deviation from this simple virial form. 
Differences in these details is the basic reason for a decades-old debate on the origin of the 
Fundamental Plane tilt. Even using excellent photometry, the main uncertainty in masses or 
M/L estimates using the scalar virial relation is in the measurement of R c : relative values 
are easy to obtain, but absolute normalizations are difficult to reproduce. This problem is al- 
ready relevant for nearby galaxies and may cause significant biases in virial mass and size 
determinations at high-redshift. Dynamical model can eliminate these problems. We measure 
a median dark matter fractions of /dm = 16% in our sample, which implies the total density 
profile within 1R C is dominated by the stellar density pstars- When approximated by a power- 
law the latter has the 'isotermal' form PstarsM °c r~ 2 - with an intrinsic rms scatter of 0.2 
for our sample. We revisit the (M/L) e — a e relation, which describes most of the deviations 
between the VP and the FP. The best-fitting relation is [MJ L) e oc 69 . Given that part of the 
(M / L) e increase with a e is already explained by systematic variations in age, metallicity (and 
possibly dark matter fraction), this relation provides an upper limit to any increase of the IMF 
mass normalization with a e . We find differences in the relation as a function of galaxy rotation 
and environment, with the correlation being slightly more shallow and with smaller scatter 
for slow rotating systems or for galaxies in Virgo. For the latter, when using the best distance 
estimates, we observe a scatter in (M/L) e of 10%, from which we infer an intrinsic one of 
just 6%. We perform an accurate empirical study of the relations between a e and the galaxies 
circular velocity within IR C and find the two empirical relations V C i rc (R™ ax ) re 1.51 X a e and 
max(T4i rc ) re 1.76 x a e , which are satisfied with small scatter and negligible dependence of 
<j e for our entire sample. The accurate parameters described in this paper are used in the com- 
panion Paper XX of this series to explore the variation of global galaxy properties, including 
the IMF, on the projections of the VP. 

Key words: galaxies: elliptical and lenticular, cD - galaxies: evolution - galaxies: formation 
- galaxies: structure - galaxies: kinematics and dynamics 



1 INTRODUCTION 

Scaling relations of early-type galaxies (ETGs, ellipticals E and 
lenticulars SO) have played a central role in our understanding of 
galaxy evo lution, since the discov ery that the stellar velocity dis- 
persion a (Faber & Jackson] 1 1976b and the galaxy projected half- 
light radius R e dKormendvl l 1 977 ) correlate with galaxy luminosity 
L. An important step forward was made with the discovery that 
these two relations are just p rojections of a relatively narrow plane , 
the Fundamental Plane (FP ) dFaber et alJl987l : lDressler et aljl987l : 
Djorgovski & Da visll987h . relating the three variables (L, a e ,R e ). 
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When the plane is used as a distance indicator, as was especially 
the case at the time of its discovery, the luminosity can be replaced 
by the surface brightness within R c as E e = L/(2irR c 2 ) and the 
observed plane assumes the form 

Re OC cr 1 - 3 ^ ' 82 , (1) 

where the adopted paramet ers are the median of t he 1 1 independent 
determinations tabulated in lBemardietaI1 ( l2003l) . 

It was immediately realized that the existence of the FP 
could be due to the ga laxies being in virial equilibrium (e.g. 
iBinnev & Tremain e 2008) and that the deviation (tilt) of the coeffi- 
cients from the virial predictions R e oc a 2 E^ x , could be explained 
by a smooth power-law variation of mass-to-light ratio M/L with 
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mass jFaberetal.ll 19871) . The FP showed that galaxies assemble 
via regular processes and that their properties are closely related 
to their mass. The tightness of the plane gives constraints on the 
variation of stellar population among gala xies of similar charac- 
teristics and on their dark matter content (Renzini & Ciottj[l993l : 
iBorriello et alj|2003h . The regularity also allows one to use the FP 
to study galaxy evolution, b y tracing its variations with redshift 
J van Dokkum & Franxll 19961) . 

However, other reasons for the deviation of the coefficients are 
possible: the constant coefficients in the simple virial relation only 
rigorously apply if galaxies are spherical and homologous systems, 
with similar profiles and dark matter fraction. But both galaxies 
concentration ( Caon et al J 1993h and the amount of random motions 
in their stars jDavies et al. 19831) were found to systematically in- 
crease with galaxy luminosity. 

The uncertain origin of the tilt led to a large number of in- 
vestigations about its origin, exploring the effects of (i) the sys- 
tematic variation in the stellar population, or (ii) the non-homology 
in the surface brightness distribution or (iii) the kinematic, or (iv) 
the variation in the amount of dark matter, on the FP tilt and scat- 



ter (Renzini & Ciotti 1993; Prugniel & Simien 


Il994l.ll996t 


1 19971: 


Ciotti et al.l 


1996; Graham & Colless 


1997; 


Forbes et al. 


|l998|; 


Bertin et al. 2002: Borriello et al.ll2003l: 


Truiillo et alj|2004n. Those 



works were all based on approximate galaxy spherical models, try- 
ing to test general hypotheses and not reproducing real galaxies in 
detail, which sometimes led to contrasting results. What became 
clear however was that various effects could potentially influence a 
major part of the FP tilt. Moreover it was found that the scatter in 
the FP implies a well regulated formation for ETGs. 

The next step forward came with subsequent studies, which 
instead of testing general trends, used small samples of objects and 
tried to push to the limit the accuracy of measuring galaxy central 
masses, while reducing biases as much as possible. Those accurate 
total masses could be directly compared to the simple virial ones, 
testing for residual trends. Similar but independent studies were 
performed using two completely dif ferent techniques, either stel- 
lar dy namics JCa ppellari et al. 2006) or strong gravitational lens- 
ing ( iBolton et alj200ll2008l ; iAuger et alfeOlpj) . The results from 
those efforts agree with each others, and showed that the tilt of the 
FP is almost entirely due to a genuine M/L variation. 

In this paper we investigate once more the origin of the FP tilt. 
This new study is motivated by the dramatic increase in the size 
and quality of our galaxy sample, with respect to any p revious sim- 
ilar s tudy. We have in fact state-of-the-art SAURON dBacon et alj 
l200lh stellar kinemati cs for all the 260 ear ly-type galaxies of 
the ATLAS 3D sample dCappellari e"t~aT1l2011al hereafter Paper I), 
which constitute a volume-limited and carefully selected sample of 
ETGs, down to a stellar mass of about Af sta rs H x 10 9 M0. This 
fact, combined with detailed dynamical models for the entire sam- 
ple, allows us to test previous claims with unprecedented accuracy. 
The new models also include a dark matter halo and give constraints 
on the dark matter content in the centres of ear ly-type galaxies. 
These m easurements will be used in the companion lCappellari et alj 
d2012al . hereafter Paper XX) to provide a novel view of galaxy scal- 
ing relations. 

In what follows, in Section 2 we present the sample and data, 
in Section 3 we describe the methods used to extract our quantities, 
in Section 4 we present our results on the FP tilt, dark matter and the 
(M/ L)a relation, and finally we summarize our paper in Section 5. 



2 SAMPLE AND DATA 
2.1 Selection 

The galaxies studied in this work are the 260 early-type galax- 
ies which constitute the volume-limited and nearly mass-selected 
ATLAS 3D sample (Paper I). The object were morph ologically se- 
lected as early-type according to the classic criterion dHubblel 19361 : 
Ide Vaucouleurj(l959l : [Sandage]|l96ll) of not showing spiral arms or 
a disk-scale dust lane (when seen edge-on). The early-types are ex- 
tracted from a parent sample of 871 galaxies of all morphological 
typ es brighter than Mk = —21.5 mag, using 2MASS photome- 
try dSkrutskie et al.ll2006r) . inside a local (D < 42 Mpc) volume of 
1.16 x 10 5 Mpc 3 (see full details in Paper I). 



2.2 Comparison to previous samples: dynamics and lensing 

Our goal is to measure total masses, or equivalently mass-to-light 
ratios (M/L), in the central regions of galaxies. M/L of signif- 
icant samples of individual ET Gs have been previou sly obtained 
via dynamical model ling (e.g. Ivan derMarel 1991 [37 ETGs]; 



Magorrianetal]ri998 [36 ETGs] ; [Gerhard et al. 2001 [21 ETGs]; 



Cappellari et al. 2006 [25 ETGs]; Thomas et al. 2007b [16 ETGs]; 
Willia ms etalj|2009l [14 ETGsl; [Scott et al.ll2009l T4 8 ETGs]) or 



strong gravitational le nsing (e.g. Rusin et al. 2003 [22 ETGs]; 
Koopmans et"ai]|2006l [15 ETGsl; IBolton et alj|2008l [53 ETGs]; 
Auger et alj|2010al [73 ETGs]). An important, and perhaps not ob- 
vious, difference between the quantities obtained with the two tech- 
niques is that the dynamical models provide masses enclosed within 
a spherical radius, while strong lensing measures the mass inside a 
cylinder with axis parallel to the line-of-sight. Care has to be taken 
when comparing t he two methods. A n ice illustration of this fact is 
given in figure 1 of iDutton et alj d2011ah . 

An advantage of the strong lensing technique is that the re- 
covered mass inside a cylinder with the radius of the Einstein ring 
is nearly insensitive to the mass distribution, and completely in- 
dependent on the stellar dynamics. However, the requirement of a 
galaxy to act as a strong lens, necessarily imposes biases in the 
objects selection, and in particular limits mass measurements via 
st rong lensing to the most massive nearby ETGs (a > 200 km s 
in lAuger et alj|2010ah . 

The dynamical modelling technique has the significant ad- 
vantage that it can in principle be applied to any bound system 
made of stars. However, it requires a detailed treatment of the 
observed surface brightness and orbital distribution, in combina- 
tion with integral-fiel d data, for robust and accurate values (e.g. 
ICappellari etal]|2006l) . 

In this paper we apply the dynamical modelling technique to 
the ATLAS 3D sample of 260 early-type galaxies. This increases the 
sample size for which accurate total masses have been measured by 
a factor of four. Moreover the sample is volume-limited and statis- 
tically representative of the nearby galaxy population with stellar 
mass M s tars > 6 x 10 9 Mq and in particular includes ETGs with 
velocity dispersion as low as a e ~ 40 km s -1 (see Paper I for an 
illustration of the characteristics of the sample). 



2.3 Stellar kinematics and imaging 

Various multi-wavelengths datasets are available for the sample 
galaxies (see a s ummary in Paper I). In this work we make use 
of the SAURON dBacon et al1l200lh integral-field stellar kinemat- 
ics within about one half-light radius R c , which was introduced 
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in Emselle m et alj (|2004|). for the sub set of 48 early-types in the 
SAURON survey l de Zeeuw et alj|2002h . and in Paper I for the rest 



of the ATLAS sample. Maps of the stellar velo city for all the 260 
galaxies were presented in Krain ovic et alj20lTI . hereafter Paper II. 

In this paper we are not interested in the shape of the stellar 
line-of-sight velocity distribution (LOSVD), but we wa nt to approx- 
imate veloci ty moments which are predicted by the dJeanslll922D 
equations. In Cappellari et al. U2007I) we used semi-analytic models 
to compute a set of r ealistic galaxy LOSVD s with known veloc- 
ity mome nts, usin g the Hunter & Oianl d 19931) formalism, as imple- 
mented in lEmsellem et alj (11999). The models LOSVDs were used 
to broaden galaxy spectral templates and noise was subsequently 
added. The kine matics was then extract e d fro m the synthetic spec- 
tra using pPXF|Cappellari & Emsellern] d2004f) as done for the real 
galaxies. We found that Kms = VV 2 + a 2 , where V and a are the 
mean and standard deviation of the best fitting Gaussian provide 
a better empirical approximation to the velocity second moment 
(^loa) 1 ^ 2 man an integral of a more g eneral LOSVD described by 
the Gauss-H ermite parametrization 1 van der Marel & Franx|[l993l : 
iGerharJ 19931) . This is due to the large sensitivity of the moments to 
the wings of the LOSVD, which are observationally ill determined. 
For this reason all the kinematic quantities used in the paper are 
extracted using a simple Gaussian LOSVD in the pPXF software 
(keyword MOMENTS=2). 

The photometry us ed in this work co mes from the Sloan Dig- 
ital Sky Survey (S DSS ^York et alj|20"ooh data release eight (DR8 
lAihara et alj|201 ll) for 225 galaxies and was supplemented by our 
own photometry taken at the 2.5-m Isaac Newton Telescope in the 
same set of filters and with comparable signal to noise for the rest 
of the sample galaxies ( Sco tt et al]|2012l . hereafter Paper XXI). 



3 METHODS 

3.1 Measuring galaxy enclosed masses 

3.1.1 Choosing the dynamical modelling approach 

Various dynamical modelling techniques have been developed in 
the past. They are all characterized by their ability to repro- 
duce in detail, in a non-parametric way, the characteristics of 
the galaxy surface brightness. Th is contrasts with a more qual- 
itativ e toy-model approach (e.g. iTortora et alj 120091 ; iTreu et all 
|2010[) that assume a spheri cal shape and a simpler parametriza- 
tion (e. g. iHemauistl I l99d or ISersid 1 19681 profile) for the sur- 
face brightness of all galaxies. An accurate description of the 
galaxy surface brightness is a necessary requirement for quan- 
titative and unbiased measurements of dynamical quantities as 
much of the kinematic i nformation on re al galaxies is contained 
in the photometry alone dCappellarill2008h. The state of the art in 
the field is currently represented by Sch warzschildl dl979h orbit- 
superposition approach, which was originally developed to repro- 
duce galaxy stellar densities and wa s later generalized to produce 
detailed fits to the stellar kinematics (Richstone & Tremainel ll988l : 



iRix et al Jl997l ;l van der Marel et al J 19981) and has been widely used 
for determinations of masses of supermassive b lack holes (e.g. 
van der Marel et alj 1 1 9971 ; iGebhardt et alj l2000al ; ICappellari et alj 



20021: IValluri et alj 120041; iHo ughton et al. 2006) , for galaxy mass 



Thomas et al 



determinations (e.g. ICappellari et alj|2006l 
and to recover orbita l distributions (e.g. Krainovic et al j 1200: 
Cappe llari et aD 120071 ; Ivan den Bosch et alj l2008l ;lThomas et al 



2007 



1996) as impl emented to reproduce k i nematical obse r vables by var- 
ious groups dde Lorenzi etai] |2007|; Ibehnenl 120091 : lLong & Mad 
2010). When the gravitational potential is assumed to be known, 
and the particles are chosen to fully sample all integrals of motion, 
the method effectively corresponds to a particle-based analogue of 
Schwarzschild's method, and is expected to provide similar results. 
However, the method may be very useful when the potential is de- 
rived from the particles in a self-consistent way. Not much however 
is known about the convergence and uniqueness of the solution in 
this case. 

The sophistication and generality of the dynamical models has 
reached a level that exceeds the amount of information that the 
observations of external galaxies can provide. As a result the ob- 
servations are unable to uniquely cons train all the model param- 
eters, which suffer f rom degeneracies (Dejonghe & Merritt j[l992l; 
Gerha rd et all 1 19981 ; Ide Lorenzi etall 120091 : iMorganti & Gerhard! 



20121) . A key degeneracy is in the deprojection of the observed 



20091) . A close contender technique, but not a s widely used, is 



the particle-based made-to-measure method of dSver & Tremaind 



surface brightness into a three dimensional stellar mass d istribu- 
tion, which has been proved to be of mathematical nature (Rybicki 
1 19871 ; I Gerhard & Binnevll 19961) and applies even when the galaxy 
is assumed to be axisymmetric. However, similar degeneracies are 
likely to exists when higher (than zero) moments of the velocity 
are considered. This is expected from dimensional arguments: the 
current data provide at most a three-dimensional observable (an 
integral-field data cube), which is the minimum requirement to con- 
strain the orbital distribution, which depends on three integrals of 
motion, for an assumed potential and known light distribution. It 
is unlikely for the data to contain enough information to constrain 
additional para meters, like the dar k matter halo shape and the view- 
ing angles (e.g. lValluri et al.l2004l) . Numerical experiments confirm 
that even with the best available integral-field stellar kinematics, 
and assuming the gravitational potential is known and axisymmet- 
ric, even the galaxy inclination cannot be rel iably inferred from the 
data using general Schwarzschild models (Krainovic et al. 2005; 
ICappellari et alj|2006l : lvan den Bosch & van de Venll2009h . 

The situation becomes even more problematic when one con- 
siders the fact that the majority of early-type galaxies are likely to 
have bars. 30% have obvious bars (Paper II) in the ATLAS 3D sam- 
ple, but more must be hidden by projection effects. Bars are char- 
acterized by figure rotation which is ignored by most popular mod- 
elling approaches. The treatment of bars could be inclu ded in the 
mode ls as demonstrated in the two-dimensional limit by (Pfenniger 
1 1984 ) and as done to models t he Milky Way in th ree dimension 
dZhaolll996l : lHafner et alj|200d: iBissantz et alj|2004l) . However, no 
applications to external galaxies exists. This is due to the extra de- 
generacy that the addition of at least two extra model parameters, 
the bar pattern speed and position angle, will produce on an already 
degenerate problem. This combines with the dramatic increase in 
the non-uniqueness of the mass depr ojection expect ed in a triaxial 
rather than axisymmetric distribution jGerhardl 19961) and in the ad- 
ditional unavoidable biases introduced by observational errors. All 
this is expected to further broaden the minima in the \ 2 distribu- 
tions of the fits and increase the uncertainties and covariances in 
the recovered parameters. 

We chose a different approach. Rather than allowing for the 
full generality and degeneracies of the models, we adopt a mod- 
elling method that makes empirically-motivated assumptions to re- 
strict the range of model solutions and improve the accuracy of 
the mass recovery. This is motivated by the finding that the kine- 
mati cs of real fast-rotator early-type galaxies in the SAURON sam- 
ple dde Zeeuwet ai] |2002l) is well approximated by models char- 
acterized by a remarkably simple and homogeneous dynamics, 
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characterized by a cylindric ally-aligned and nearly oblate veloc- 
ity ellipsoid a$ w cjr > <j z (Cappellari 200^), as previously sug- 
gested by more general S chwarzschild's models dCappellari et alj 
l2007l ; lThomas et alj2 009). The models are called Jeans Anisotropic 
MGE (JAM), where MGE stands for the Multi-Gaussian Expan- 
sion method of lEmsellem et alj dl994h . that is used to accurately 
describe the galaxy photometry. The JAM models can reproduce 
the full richness of the observed state-of-the-art SAURON integral- 
field kinematics of fast rotator ETGs using just two free param eters 
dCappellarill2008l : IScott et alj|2009l: ICappellari etai]|2012bl) . pro- 
viding a compact description of their dynamics. The JAM models 
are ideal for this work given that the nearly-axisymmetric fast ro- 
tator ETGs constitute the 86% of the ATLAS 3D sample (Paper II; 
lEmsellem et alj201 ll hereafter Paper III). Moreover the JAM mod- 
els only require the first two velocity moments (V and a), and not 
the full LOSVD, which is not available for about half of the sample 
(see Paper I). The JAM models do not have the freedom to actually 
fit small-scale details of the kinematics, but they make a prediction 
based on an accurate description of the photometry and a couple of 
parameters. This constitutes an advantage in presence of noise and 
systematics in the data, as it makes spurious features easy to recog- 
nize and automatically exclude from the fit. Moreover the approach 
is at least three orders of magnitudes faster than Schwarzschild's 
approach. 

Not all ETGs are well described by the JAM models however: 
some of the slow rotators in ATLAS 3D are likely nearly spherical 
in the region where we have stellar kinematics, but about 10% of 
the sample galaxies are weakly triaxial or out of equilibrium (Pa- 
per II). For those objects the modelling results should be treated 
with caution. Errors of up to 20% can arise w hen measuring masses 
of triaxial objects with axisymmetr ic models dThomas et alj200"7al ; 
Ivan den Bosch & van de Venl2009h and this should be kept in mind 
when interpreting our results. However, preliminary tests using real 
galaxies in the SAURON sample indicate excellent agreement be- 
tween the M I L recover y using axisymmetric models and triaxial 
ones with identical data ( van den Bosch 2008). Moreover, in what 
follows, unless explicitly mentioned, we verified that all conclusion 
are unchanged if we remove the slow rotator galaxies from the sam- 
ple. Barred galaxies provide a further complication, which will be 
discussed in the next Section. 

3. 1.2 JAM models with dark halo 

In practice the modelling approach we use in this paper starts by ap- 
proximating the observed SDSS and INT r-band surface brightness 
distribution of the ATLAS 3D galaxi es using the Multi-Ga ussian 
Expansion (MGE) parametrization ( lEmsellem et alj Il994). with 
th e fitting me t hod a nd MGE_FIT_SECTORS software package 
of Cappellari (2002fl The choice of the photometric band is a 
compromise between the need of using the reddest band, to re- 
duce the contamination by dust, and the optimal signal-to-noise in 
the images. For barred galaxies the Gaussians of the MGE models 
are constra i ned to have the flattening of the outer disk, following 
IScott et alj d2009l. their fig. 4). Full details of the fitting approach 
and illustrations of the quality of the resulting MGE fits are given 
in Paper XXI. The MGE models are used as input for the JAM 
method dCappellaril2008l) which calculates a prediction of the line- 
of-sight second velocity moments {vf OB } for given model parame- 
ters and compare this to the observed Vims- 

1 Available from http://purl.org/cappellari/idl 
© 2012 RAS, MNRAS 0Q0.mi27l 



In Cappellari et al. (2006) it was shown that, when the sur- 
face brightness distribution is accurately reproduced and good qual- 
ity integral-field data are available, simple two-integral Jeans mod- 
els measure masses nearly as accurate as those of Schwarzschild's 
models, with errors of 6%. The agreement can be further improved 
by allowing for orbital aniso tropy, in which c ase the two methods 
give equally accurate results ( Cappell aril2008h . We have run an ex- 
tensive set of tests usi ng JAM to determine the M/L of realistic 
numerical simulations (Lablanche et al. 2012, hereafter Paper XII). 
We found that for unbarred galaxies, even when the anisotropy is 
not accurately constant inside the region with kinematic data, the 
M/L can be recovered with maximum biases as small as 1.5%. 
The situation changes when the galaxies are barred. In this case bi- 
ases of up to 15% can be expected for the typical bar strengths we 
find in ETGs. 

The mode l s we use here were already presented in 
Capp ellari et ai] d2012bl) . where they were used to uncover a sys- 
tematic variation of the stellar IMF in ETGs. That paper (their ta- 
ble 1) describes six sets of JAM models for all the ATLAS 3D galax- 
ies, making various assumptions on the dark matter halo. Given that 
the SAURON data are typically spatially limited to IR C one can- 
not expect to be able to robustly ch aracterize the shape of t he dark 
halo out to large radii from them dMamon & Loka sl l2005l) . How- 
ever, as long as the radial profile and the flattening of the halos are 
not both the same as those of the stellar distribution, we can de- 
termine how much room the models allow for a dark matter halo, 
within the regi on constrained by the ki nematics. The models were 
summarized in Cappe llari etafl d2012bl) . but we describe them here 
in some more detail using the same lettering notation as that paper: 

(A) Self-consistent JAM model: Here we assume that the mass 
distribution follows the light one as inferred from the de- 
projected MGE. In this case the model has three free parame- 
ters. Two parameters are non-linear: (i) the vertical anisotropy 
Pz = 1 — cr\ /erfj and (ii) the galaxy inclination i, which to- 
gether uniquely specify the shape of the second velocity mo- 
ment (vf os ), which is then linearly scaled by the (M/L)jam 
to fit the two-dimensional V rm s data. We emphasize that, 
even though the models do not include a dark halo explicitly, 
(M/ L) jam does not represent the stellar M/L, as sometimes 
incorrectly assumed, but the total one, within a spherical re- 
gion which has the projected size of our data (see discussion in 
Section |4.1.2| (. This set of models, like all others, has a central 
supermassi ve black hole with mass predicted by the Mbh — a 
correlation dGebhardt et alJl2000bl : lFerrarese & MerrinfeOOd) . 
or a black holes mass as published, when available. The su- 
permassive black hole has a minimal effect on M/L in nearly 
all cases, but we still exclude the central R < 2" from the 
fits, for maximum robustness. Examples of mass-follows-light 
JAM models are shown in Fig.Q] 

(B) JAM with NFW dark halo: Thi s set of models adopted the 
approach introduced by iRix et al I d 19971) to reduce the halo 
to a one-parameter family of models. This approach was al- 
ready used with axisymmetric JAM m odels of disk galaxies, 
as done here, by I Williams et alj (|2009) and t o construct spher- 
ical toy models of variou s stellar systems dNapolitano et alj 
120051 ; iTollerud et alJl2oTlh . We assume the halo is spherical 
and characteri zed by the two-para meters double power-law 
NFW profile dNavarro et all 1 19961) . We then adopt the halo 
mass-concent ration M200 — C200 relation dNavarro et all 19961) 
as given by dKlypin et al .1 120 1 H) to make the halo profile a 
unique function of its mass A/200 • The latter is not a critical 



6 M. Cappellari et al. 




-20-10 10 20 



-10 10 



-20 -10 10 



-20-10 10 



-15-10-5 5 1015 



-1513-50 5 1015 



Figure 1. Mass-follows-light JAM model examples. In each panel, the top plot shows the by-symmetrized SAURON data, with the observed galaxy surface 
brightness overlaid, in steps of 0.5 mag. The bottom plot shows the best-fitting JAM model, with the MGE surface brightness assumed by the model. The 
models (A) have just two free non-linear parameters, the inclination and the global anisotropy [i, /3 Z ), to reproduce the shape of the observed second velocity 
moments V Ims = \/V 2 + a 2 , where V is the mean stellar velocity and a is the stellar velocity dispersion. Yet, once the surface brightness is given, most of 
the variety in our maps can be reproduced. The most significant deviations between data and models are due to bars, recognizable from the asymmetries in the 
observed surface brightness. The predictive power of these simple JAM models qualitatively suggest that the assumed total potential may not be significantly in 
error, which would imply dark matter is unimportant (or accurately follows the light). The good fits shows that ETGs have a relatively simple dynamics within 
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assumption: our observations only sample a region well inside 
the predicted halo break radius, so that all our conclusion are 
unchanged if we describe the halo with a simple power law 
density profile p(r) oc r -1 , as we numerically verified. The 
resulting JAM models have in this case four parameters: (i) 
The galaxy inclination i (ii) the anisotropy B z , (iii) the stel- 
lar (M/I/) st ars, assumed spatially constant and (iv) the halo 
virial mass M200, defined as the mass within the spherical ra- 
dius r2oo at which the average density is equal to 200 times the 
critical density of the Universe. 

(C) JAM with contracted NFW dark halo: These models in- 
clude a halo which is originally assumed to be of NFW form, 
with concentration specified by its mass via the M200 — 
C200 relation as in (B). However, during the fitting process, 
for every choice of the model parameters, the halo is con- 
tracted according to the enclosed stellar mass distribution, 
which is defined by the (circularized) MGE and the corre- 
sponding (Af/Z/) st ars parameter. For th e contraction we used 
th e prescription of iGnedin et alj d201ll) . which is an update 
of lGnedin etaT] J2004h . We verified that our IDL code pro- 
duces t he same output as the C language software CON- 
TRA by IGnedin et all d2004l) . when the same input is given. 
The resulting JAM model has the same four free parameters 
(i,p x , (M/L) st ars, M200) as in (B). 

(D) JAM with general dark halo (gNFW): These models in- 
clude a dark halo th at generalizes the NFW profile (see also 
Barn abietalJl2012h 



Pdm(t) = p s ( — 



1 1 r 

2 + 2~ s 



-7-3 



(2) 



The density has the same asymptotic power-law slope ft — — 3 
as the NFW halo, but it allows for a variable inner slope, 
which we constrained to the bounds —1.6 < 7 < 0, 
by assigning zero probability to the prior P(model) = 
(Section [3.1.3b outside this parameters range. The ranges in- 
clude a flat inner core 7 = and the NFW 7 = — 1. 
The upper bound was chosen as the nearly maximum slope 
we measured for all contracted halos in (C) (top panel of 
Fig.[2](. However, recent simulations suggest that baryonic ef- 
fects produce flatter halos than these predic tions for a broad 
range of galaxy masses dDuffv et alJl201Ct iGov ernato et al. 



file without any free parameter. During the fitting process the 
halo mass M200 is determined from the enclosed stellar mass 
Af s t ar a, which is given by the total luminosity of the MGE 
model multiplied by its current (M/ L) B tars- This is done u sing 
the M200 — M s tars relat ion derived bv lMoster etaflfeOlCl) (see 
also [Moster et~aT, I l2012h . which matches the observed galaxy 
luminosity functions to the simulated halos mass function. 
However, negligible differences would have been obtained us - 
in g e.g. the similar r elations derived bv lBehroozi et alj d2010h 
or iGuo et al. I fcfJlCh . For a given halo mass, the concentra- 
tion is specified by the M200 — C200 relation as in (B). The 
only free model parameters are the three of the stellar com- 
ponent (i, p z , (M/Z/) st ars) as in (A). This fixed-halo assump- 
tion, in c ombination howev er with equally fixed spherical and 
isotrop ic Hernquist ( 1990) galax y models, w as also used by 
lAuger et alj feOlObf) and lDeason et alj l2012h . 
(F) JAM with fixed contracted dark halo: The halo has a con- 
tracted profile without any free parameter. For a given stel- 
lar mass, the halo has initially the same NFW form as in (E), 
bu t the profile i s contr acted as in (C) using the prescription 
of IGnedin et alj d201 ll) . The only free model parameters are 
the three of the stellar component (i,j3 z , (M/L) s tars) as in 
(A). This fixed-halo assumption, in co mbination howev er with 
equally fixed spherical a nd isotrop ic Hernquisn d 19900 galaxy 
models, was also used bv lAuger et alj ( feOlObh . 



3.1.3 Bayesian inference of the JAM model parameters 

The determination of the JAM model parame ters for the 260 
ATLAS 3D galaxies in Cappellari etafl d2012bh was done using 
Bayesian inference dGelman et alj|2004l). The sam e approach was 
adopted using JAM models in lBarnabe et "all d2012h in combination 
with gravitational lensing. From Bayes theorem, the posterior prob- 
ability distribution of a model, with a given set of parameters, given 
our data is 



P(model I data) oc P(data | model) x P(model). 



(3) 



Here we make the rather common assumption of Gaussian errors, 
in which case the probability of the data, for a given model is given 
by 



20 1 ; Inoue & S aitoh 1201 ll; IPontzen & Governatol 120121: 



Laporte et alll2012l : iMaccio et a l. 2012; Martizzi et al. 20 lj). P(data | model) oc exp 



Note that our adopted maximum halo slope is still gener- 
ally more shallow than the typical 'isothermal' average power 
slope 7' = 2.0 the we measure for the stellar density alone 
within 1P C (bottom panel of Fig. [2}- This fact is important to 
avoid model degeneracies between the stellar and halo densi- 
ties. This model is the most general of all six and it includes 
any of the other five models as special cases. It has five free pa- 
rameters: (i) the galaxy inclination, (ii) the anisotropy /3 Z , (iii) 
the stellar mass M st ars, (iv) the halo inner slope 7 and (v) the 
halo density p s at r 3 , which we parametrized using the dark 
matter fraction fDAi(r = R e ) to reduce the strong correlation 
between p a and 7 during the parameter estimation. The break 
radius r s of the halo was not included as a free parameter given 
that it is (in models E) generally 3-5 times larger than R c and 
it is completely unconstrained by our data. We fixed r s — 20 
kpc, which is the median value for all models E, but we veri- 
fied that nearly identical results are obtained if we describe the 
halo with a simple power-law density profile p(r) oc r -7 ; 
(E) JAM with fixed NFW dark halo: The halo has a NFW pro- 



2 



with 




(4) 



(5) 



We further assume a constant noninformative prior P(model) for 
all variables within the given bounds. 

The calculatio n of the posterior distr ibution is performed 
using the adaptive iMetropolis et alj dl953l) (AM) algorithm of 
Haari o~et"al] d200ll) . The AM method adapts the multivariate Gaus- 
sian proposal distribution during the Markov chain Monte Carlo 
sampling, in such a way that the Gaussian proposal distribution has 
the same non-diagonal covariance matrix as the posterior distribu- 
tion accumulated so far by the algorithm. This natural idea is similar 
to what is routinely done e.g. in the determination of cosmological 
parameters, where the covaria nce matrix of the pos terior is calcu- 
lated after a burn-in phase (e.g. iDunklev et alll2005h . However, the 
adaptive approach converges much more rapidly as the proposal 
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Figure 2. Inner slope of contracted dark halos and luminous matter. Top 
Panel: Histogram of the halo slope of contracted halos for all 260 ATLAS 3D 
galaxies in model (C). The slopes were determined by fitting a power law 
relation pdm (r) <x r 7 inside the radius r < r s /A, where we verified the 
contracted halo profiles are accurately described by a power law. Bottom 
Panel: Histogram of the slope of the deprojected stellar mass density distri- 
bution from the MGE models. The slope was fitted inside a spherical radius 
r = Re ■ Although the stellar density p*(r) <x r 7 inside that radius is not 
always accurately described by a power-law, on average the stellar slope 
peaks with high accuracy at at the 'isothermal' value 7' Ri 2.0, with an 
intrinsic scatter of just a = 0.24 for our entire sample. 



distribution starts approaching the posterior already after a few 
points have been sampled. We found the adaptive approach abso- 
lutely critical for the speed up of our calculation by orders of mag- 
nitudes, given the strong degeneracies between the model param- 
eters producing inclined and narrow posterior distributions. Some 
examples of the posterior distributions obtained with our approach 
are shown in Fig. [3] Although the adaptive nature of the AM algo- 
rithm makes the resulting chain non-Markovian , their authors have 
proven that it has the correct ergodic properties dHaario et al. 2001) 
and for this reason it can be used to estimate the poster ior distribu- 
tion as in standard Markov chain Monte Carlo methods dGilks et alj 

Moreover, to basically eliminate the burn-in phase of the AM 
method, we use the efficient and extremely robust DIRECT de- 
terministic global optimization algorithm of I Jones et al.1 d 1993b to 
find the starting location without the risk for the Metropolis stage 



to be stuck in a possible secondary minimum in multi-dimensional 
parameter space. 

An important addition to the fitting process is an iterative 
sigma clipping of the kinematics, to remove spurious features in the 
data like stars or problematic bins at the edge of the SAURON field 
of view. This is important for a sample of the size of ATLAS 3D , 
where the quality of every Voronoi bin cannot be assessed manu- 
ally for all galaxies. After an initial fit the few bins deviating more 
than 3cr of the local rms noise are excluded from the fit and a new 
fit is iteratively performed, until convergence. 



3.2 Robust fitting of lines or planes to the data 

3.2.1 Goodness of fit criteria 

The apparently simple task of fitting linear relations or planes to 
a set of data with errors does not have a well defined and ob- 
vious solution and for this reason has continued to generate sig- 
nificant interest. A number of papers have di scussed the solu- 
tion of the corresponding least-squares problem dlsobe et alJI 19861: 
Feigelson & Babulll992ljAta-itas & Bershadvl[l996l ; iTremaine et"ai] 
20021 : iPress et alj|2007l) . while more rec ent works ha v e addressed 
the p roblem using Bayesian methods dKellvl 120071 : iHogg et alj 
2010). A popular m ethod is the least-squares approach by 
Tremaine et alj d2002f). which i s an ex tension of the FITEXY pro- 
cedure described in lPress et alj d2007l section 15.3). The method 
defines the best fit of the linear relation y — a + b(x — xq) to a set 
of N pairs of quantities (xj,yj), with symmetric errors Axj and 
Ayj , as the one that minimizes the quantity 

2 [a + Hxj - x Q ) - y 3 ] 2 

X 2^ {b A Xj y + (Ay.r + el W 
3=1 

Here xo is an adopted reference value, close to the middle of the 
Xj values, adopted to reduce uncertainty in a and the covariance 
between the fitted values of a and b. While e y is the intrinsic scat- 
ter in the y coordinate, which is iteratively adjusted so that the % 2 
per degree of freedom v = N — 2 has the value of u nity expected 
for a good fit. As recognized bv lWeineretaI1d2006t) . minimizing 
the above x 2 corresponds to maximizing the likelihood of the data 
for an assumed intrinsic probability distribution of the observables 
described by the linear relation y = a + b(x — xo) + e v , where 
e y is the Gaussian scatter projected along the y coordinate, and one 
assumes a uniform prior in the x coordinate, equation ^ is only 
rigorously valid when the errors in x and y are Gaussian and uncor- 
rected (have zero covariances). A term —2b Cov(xj , j/j ) should be 
included i n the denominator if the cov ariances are known and non- 
zero (e.g. iFalcon-Barroso et ai]l20T l|). The la confidence interval 
in t y can b e obtained by findin g the values for which \ 2 = "± V2v 
as done bv lNovak et al.1 d2006h . The apparent asymmetry of equa- 
tion $6$ with respect to the x and y variables does not imply we 
assume only the y variable has intrinsic scatter. In fact the assumed 
intrinsic distribution has a Gaussian cross section along any direc- 
tion non parallel to the ridge line y — a + b(x — xq). The value e y 
merely specifies the dispersion along the arbitrary y direction. The 
formula would give completely equivalent results by interchanging 
the x and y variables if the distribution of x values was uniform and 
infinitely extended as assumed. Any difference in the fitting results 
when interchanging the x and y coordinates are due to the breaking 
of the uniformity assumptions. 

equation ([6} can be generalized to plane fitting by defining the 
best-fitting plane z — a + b(x — xo) + c(y — yo) to a set of TV 
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Figure 3. Examples of JAM dynamical modelling with general dark halo using Adaptive Metropolis. Each panel shows the posterior probability distribution 
for the parameters (q, /3 Z , /dm, 7), using galaxy model (E) marginalized over two dimensions (colour contours) and one dimension (blue histograms). For 
every non-linear parameters combination the (M/L) s tars is linearly scaled to fit the data. We assumed ignorant (constant) priors on all model parameters. The 
name of the galaxies is written next to each panel. The symmetrized V rms SAURON data, and the best-fitting model are shown on the right (as in Fig.[TJ. This 
plot illustrates a variety of situations and shapes of the kinematic field: (i) some models (NGC 2685 and NGC 3379) are well constrained within the explored 
parameters boundaries and have preferred slope 7 ss — 1 like NFW; (ii) others (NGC 2774 and NGC3607) prefer a flat 7 ss or even positive values; (iii) 
others (NGC 3193) have nearly unconstrained halo slope; (iv) others (NGC 2549) prefer steep halo slopes at the boundary 7 = —1.6 of our allowed parameter 
range. In all cases there is a strong degeneracy in the halo slope, but the dark matter fraction is tightly constrained by the data to be small (/dm ^ 30% in these 
examples). Only allowing the inner halo slope to match the stellar density 7' —2.0 could significant dark matter be included in some of the models. 



triplets of quantities (xj,yj, Zj), with symmetric errors Axj, Ayj 
and Azj, as the one that minimizes the quantity 

2 = [a + b(xj - asp) + c(yj - y ) - Zj] 2 
x 2^ (bAxrf + {cAy.Y + (Azj) 2 + ei ' ( ' 

Here xo and yo are adopted reference values, close to the middle 
of the Xj and yj values respectively, adopted to reduce uncertainty 
in a and the covariance between the fitted values of a, b and c. 



While e z is the intrinsic scatter in the z coordinate, which is itera- 
tively adjusted so that the \ 2 P er degrees of freedom v = TV — 3 
has the value of unity expected for a good fit. As in the two- 
dimensional case the minimization of equation Q is equivalent to 
the maximization of the likelihood of the data, for an underlying 
probability distribution of the observables described by the relation 
z — a + b(x — Xo) + c(y — yo) + e z , where e z is the dispersion of 
the Gaussian intrinsic scatter in the plane, projected along the z co- 
ordinate, for a uniform prior in the x and y coordinates and assum- 
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ing uncorrelated and Gaussian errors in the x, y and z observables 
(zero covariances). equation ((7} reduces to the so called orthogonal 
plane fit when the measurements errors are ignored and one simply 
assumes Axj = Ayj = Azj . This latter form i s the one gener- 
ally used when fitting the Fundamental Plan e (e.g. Ijorgensen et al] 
ll996| ; |Pahre et al.ll998l ; lBernardi et al.l2003l) . Contrary to the popu- 
lar approach, equation 10 allows for intrinsic scatte r in the relation, 
which is important to deriving unbiased parameters jTremaine et alj 
l2002h . 

Recently iKelrvl ]2007l) proposed a Bayesian method to treat 
the linear regression of astronomical data in a statistically rigor- 
ous manner, allowing for intrinsic scatter, covariance between mea- 
surements and providing rigorous errors on the parameters in the 
form of random draws from the posterior i distribution (s e e also 
iHogg et al.ll201Cl) . He pointed out that the iTremaine et alj J2OO2T) 
approach to linear fitting can lead to biased results in some cir- 
cumstances. For this reason in all our fits we used both the results 
and errors derived from equation {6]l and Q, and the corresp onding 
results obtained with the Bayesian method and software by iKellvl 
which was k indly made available as part of the IDL NASA 
Astronomy Library ]Landsmar]| 19931) . In all cases differences be- 
tween the two method where found to be insignificant, in both the 
fitted values and the errors, confirming the near conceptual equiva- 
lence of the two technically very different approaches. 



3.2.2 Least Trimmed Squares robust fits 

A general issue when fitting linear relations to data using least- 
squares methods is the presence of outliers, which can dominate 
the \ 2 an d bias the parameter recovery. This is the reason why a 
number of previous studies have determined the parameters of the 
Fundamental Plane using the more robust method of minimizing 
absolute instead of squared deviations (e.g. Ijorgensen et alJ[T996l : 
IPahre et aLH l998), at the expense of decreasing the statistical effi- 
ciency, namely larger errors on the fitted parameters. An alternative 
simple solution, which maintains the efficiency of the least-squares 
method for Gaussian distributions, consists of removing outliers by 
iteratively clipping points deviating more than 3a from the cur- 
rently best-fitting relation. A problem with the cr-clipping approach 
is that it is not guaranteed to converge to the desired solution in 
the presence of sign ificant outliers. A lternative robust methods have 
been proposed (see lPress et alJl2007L section 15.7). However, they 
complicate the error estimation and like the standard cr-clipping do 
not always converge. 

After some experimentation with different robust approaches 
the only fully satisfactory solution w e found is the Least T r imme d 
Squares (LTS) regression approach of lRousseeuw & Lerovl dl987l) . 
The reason for its success is that the method, as opposed to other 
robust approaches, finds a global solution. The approach consists of 
finding the global minimum to 



Xh = ^2(r 2 )j-.N, 

2=1 



which however a fast and n early optimal solution (FAST-LT S) has 
recently been proposed by Rousseeu w & Van Driesser] J2006l> . 

In our implementations, which we called LTSXINEFIT and 
LTS_PLANEFIT for the line and plane case respectively, we proceed 
as follows: 

(i) We adopt as initial guess e = for the intrinsic scatter in the y 
(for LTS_LINEFIT) or z coordinate (for LTS_PLANEFIT); 

(ii) We start by default with h = [N + p + l)/2, where p is the data 
dimension, and use the FAST-LTS algorithm to produce a least- 
squares 

fiB to the set of points characterized by the smallest \\ 
(defined by equation[6]or|7]i; 

(iii) We compute the standard deviation a of the residuals for these h 
values and extend our selection to include all data point deviating 
no more than 2.6 a from the fitted relation (99% of the values for a 
Gaussian distribution); 

(iv) We perform a new linear fit to the newly selected points; 

(v) We iterate steps (iii)-(iv) until the set of selected points does not 
change any more; 

(vi) We calculate the \ 2 f° r the fitted points; 

(vii) The w hole process (i)- (vi) is iterated varying e using Brent's 
method dPress et al] |2007. section 9.3) until \ 2 = v. 

(viii) The errors on the coefficients are computed from the covariance 
matrix; 

(ix) The error on e is computed by increasing e until \ 2 = v — y/2v 
(we do not decrease it to avoid problems when e « 0). 

This method was used to produce all fits in this paper and auto- 
matically exclude outliers. Note that although the approach may ap- 
pear similar to the standard a clipping one, and produces similar re- 
sults in simple situations, the key difference is that in LTSXINEFIT 
and LTS_PLANEFIT the clipping is done from the inside-out instead 
of the opposite. This was found to be the essential feature for the 
resulting extreme robustness, which was essential in particular to 
objectively select Virgo members in Fig. Q3] Once th e outliers are 
removed, the same set of points was used as input to lKellvl d2007l) 
Bayesian algorithm. 



3.3 Measuring scaling relations parameters 

3.3.1 Determination of L, R c and T\iifrom the MGE 

Galaxy photometric parameters are generally determined using 
three main approaches: (i) fitting growth curves, where one 
constructs profiles of the enclosed light within circular an- 
nuli and extrapolates the outermost part of the galaxy pro- 
file to infinite r adius, typically using the analytic growth curve 
of the R 1 / 4 |de Vaucouleurj 1 1948}) profile (e.g. the Seven 
Samurai: iBurstein et alj 1 19871 and iFaberetall 1 19891 ; the RC3: 
Ide Vaucouleurs et"al]|l99ll and Ijorgensen et alj|l995al); (ii) fitting 



R 1/n dSersidl 19681) profile (e.g. lGraham & Collessll 19971), pos 



(8) 



where (r 2 )i : jv < (r 2 )2-.N < ••• < (r 2 )N:N are the ordered square 
residuals from the linear regression of a subset of N/2 < h < 
N data points. In other words the LTS method consists of finding 
the subset of h data points providing the smallest Xh> among all 
possible ft-subsets. It's easy to realize that this approach is robust to 
the contamination of up to half of the data points, when h ss N/2. 
This is a computational very expensive combinatorial problem for 



sibly including an exponential disk (e.g. ISaglia et al] 1997I) . to 
the circularized profiles and finding the half-light from the mod- 
els; (iii) fitting flattened two-dimensional models directly to the 
galaxy images, where t he profile of the mo dels is again parame- 
terized by an R 1 ^ 4 (e.e. Bernardi et al. 200 3]), or by an R 1 ^ 4 bulge 
plus exponential dis k (e.g. Gebhardt et al. 2003; Sagli a et al1l2010l : 
iBernardi etai]|2010h . 



2 In all the no nlinear fits the mini mization was performed with the IDL pro- 
gram MP FIT by Markwardt 1 2009), which is in an improved implementation 
of the MINPACK Levenberg-Marquardt nonlinear least-squares algorithm 
by |Moreetal]ll98fj|) 
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Here we have MGE photometric models for all the galaxies 
in the sample based on the SDSS+INT photometry (Paper XXI). 
Due to the large number of Gaussians used to fit the galaxy images, 
the MGE models provide a compact and essentially non-parametric 
description of the galaxies surface brightness, which reproduces 
the observations much more accurately than the simpler bulge and 
disk models, but more robustly than using the images directly. Our 
MGE fitting app roach is in fact analogue to the popular GALFIT 
2002) software, when it is used to match every detail of 



the circularized MGE enclosed within a cylinder of projected radius 
R is then 



MUfc fitting 
( Peng et al.ls 



a galaxy image using multiple components. Here we use the MGE 
models to measure the phot ometric parameters (L and R c ) in our 
scaling relations as done in ICappellari etafl j2009h . A significant 
difference between this MGE approach and all the ones previously 
mentioned is that it does not extrapolate the galaxy light to infinite 
radii. Outside three times the dispersion 3max(<jj) of the largest 
MGE Gaussian, the flux of the model essentially drops to zero. No 
attempt is made to infer the amount of stellar light that we may have 
observed if we had much deeper photometry. For this reason this R c 
must be necessarily smaller than the ones obtained via extrapolation 
to infinite radii. Earlier indications using deeper MegaCam photom- 
etry, which we h ave acquired for many of the galaxies in our sample 
jPuc et alj20lTh [hereafter Paper IX], seems to confirm that R c de- 
terminations have to be used with caution and depend sensitively 
on the depth of the adopted photometry. 

Our method has the advantage that it extracts R c directly 
from the observations: it measures the radius enclosing half of the 
observed galaxy light. The extrapolation method depends on the 
assumed form of the unobservable galaxy profile out to infinite 
radii. This is perhaps sensible for genuine spheroidal and single- 
component elliptical galaxies, which are thought to be well de- 
scribed by ISersid d 19681) profiles. But our volume-limited sample 
of ETGs is dominated by fast rot ators (Paper II; Paper III), char- 
acterized by the presence of disks ( Krajnovic et al. 2012, h ereafter 
Paper XVII) and closely linked to spiral galaxies ( Cappell ari et alj 
l2011bl hereafter Paper VII; Paper XX). Given the variety in 
the outer profiles of spi ral galaxies dvan der Kruit & Searldfl98ll : 
IPohlen & TruiillduOOd) it seems safer to make our results inde- 
pendent on the adopted profile extrapolation, but to use a quantity 
that can be directly inferred from the data. Neither the extrapolation 
method, nor our truncated one can claim to measure the 'true' R c 
or the galaxies. 

If the a>axis is aligned with the galaxy photometric major axis, 
and the coordinates are centered on the galaxy nucleus, the surface 
brightness of an MGE model at the position (V, y') on the plane of 
the sky, already analytically deconvolved for the atmospheric seeing 
effects, can be written as temsellemet"afll 19941) 



E(x',y') = ^ Sj exp 



2 »? 



(9) 



where M is the number of the adopted Gaussian components, hav- 
ing peak surface brightness E,-, observed axial ratio < q' k < 1 
and dispersion at along the major axis. The total luminosity of the 
MGE model is then: 

M M 

L = ^L,=^2^>^, (10) 

j = l 3=1 

where L, are the lumino s ities o f the individual Gaussians. 

In ICappellari et all j2009h the effective radius of the MGE 
model was obtained by circularizing the individual Gaussians of 
the MGE, while keeping their peak surface brightness. This was 
achieved by replacing (oj , q'j) with (a y/qj, 1) • The luminosity of 



M 

L(R) = ^L j 



exp 



R 2 



3 H 3 



(ID 



The circularized effective (half-light) radius R c was found by solv- 
ing L(R) = L/2, using a quick interpolation over a grid of 
log R values. When the MGE has constant axial ratio q'j = q 
for all Gaussians, this approach finds the circularized radius R c — 
\fah — a^/lf of the elliptical isophote containing half of the ana- 
lytic MGE light, where a is the major axis of the isophote. This is 
the quantity almost universally used for studies of scaling relations 
of ETGs. When the axial ratios of the different Gaussians are not all 
equal, the approach finds an excellent approximation for the radius 
R c — y/ A e /-K of a circle having the same area A e as the isophote 
containing half of the MGE light. In fact we verified that for all the 
MGE of the ATLAS 3D sample the two determinations agree with 
an rms scatter of just 0. 17% and only for four of the flattest galaxies 
the difference is larger tha n 3%. 

Hopki ns et al. I d2010l) pointed out the usefulness of adopting 
as size parameter the major axis i?™ ax of the half-light isophote in- 
stead of the circularized radius R c , when analysing results of simu- 
lations. The motivation is that i?™ ax is more physically robust and 
less dependent on inclination. Here we also calculate i?™ ax for our 
observed galaxies as follows. 

(i) We construct a synthetic galaxy image from the MGE using equa- 
tion (|9), with size maxf^) x max(ffj) (only one quadrant is 
needed for symmetry); 

(ii) We sample a grid of surface-brightness values Hk = jj,(xk,0) 
along the MGE major axis, and for each value we calculate the light 
enclosed within the corresponding isophote; 

(iii) We find the surface brightness /x e of the isophote containing half 
of the analytic MGE total light by solving L(/i) — L/2 using linear 
interpolation; 

(iv) i?™ ax is the maximum radius enclosed inside the isophote fi e (the 
largest x coordinate). 

We also calculate the circularized effective radius of the isophote 
R c = yj A/tt of area A and t he effective ellipticity e e of the MGE 
model inside that isophote as ( ICappellari et al . 2007) 



(1 



<y!> 



(12) 



where Ft is the flux inside the k-th image pixel, with coordinates 
(xk, Vk) and the summation extends to the pixels inside the chose 
isophote. A similar quantity was calculated from the original galaxy 
images in Paper III, but we use here this new determination for max- 
imum consistency between our e e and the ellipticity of the MGE 
models in the tests of Fig. [4] 

We studied the dependence on inclination of the two defini- 
tions of effective radii using the photometry of real galaxies. For 
this we selected the 26 flattest galaxies in our sample, all having 
axial ratio q' < 0.4. These galaxies are likely to be close to edge- 
on. We assume they are exactly edge-on and we then use the MGE 
formalism (equations l9l 1 13landl 14b to deproject the surface bright- 
ness and calculate the intrinsic luminosity density. We then project 
it back on the sky plane at different inclinations, from edge on 
(i — 90°) to face on (i — 0°). At every inclination we calculate the 
two effective radii R c and i?™ ax (Fig. |5). The comparison shows 
that, as expected, the R c of flattened objects can be much smaller 
when objects are edge-on than face-on, with a median decrease of 
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Different effective rodii definitions 
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Figure 4. Different definitions of R c as a function of the galaxy ellipticity. 
The red filled diamonds are the projected radii R c of a cylinder with the 
same area of the half-light isophote. The blue filled circles are the radii r-^ 
of a sphere with the same volume as the half-light iso-surface. In both cases 
the radii are normalized to i?™ ax , which is the projected semi-major axis of 
the half-light isophote, having ellipse of inertia of ellipticity e e . The red and 
blue dashed lines are the relations /(e e ) = and /(e e ) = y/e^ 

respectively. The horizontal dashed line marks the theoretical value 4/3, 
which approximately applies to a number of simple theoretical profiles. 



fitting inclination of the JAM models. A possible deprojection of the 
observed MGE surface brightness can be deriv ed analytically by de - 
projecting the individual Gaussians separately dMonnet et aljl992h. 
The s olution is only unique when the galaxy is edge-on dRvbickil 
Il987h . The deprojected luminosity density v is given by 



2najqj 



exp 



2a] 



R 2 + - 



(13) 



where the individual components have the same dispersion <jj as in 
the projected case {5), and the intrinsic axial ratio of each Gaussian 
becomes 



Qj = 



(14) 



where i is the galaxy inclination (i = 90° being edge-on). To cal- 
culate ri/2 from the intrinsic density of equation U3\ one can pro- 
ceed analogously to the approach used to measure the circularized 
R c . This is done by making the three-dimensional MGE distribu- 
tion spherical, while keeping the same total luminosity and peak 
luminosity density of each Gaussian. This is achieved by replac- 
ing (it, , qj) with (<r q^ 3 , 1). The light of this new spherical MGE 
enclosed within a sphere of radius r is given by 



L(r) 



M 



[erf(/ij) — 2hj exp(— hj )/\/t?\ , 



(15) 
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Figure 5. Inclination dependence for different definitions of the effective 
radius. The red lines show the change in the measured circularized R c , nor- 
malized to the face-on value, when the inclination is changed from edge-on 
(i = 90°) to face-on, for the 26 flattest ATLAS 3D galaxies. The blue di- 
amond marks the median (43%) of the maximum variation. The blue lines 
show the same variation with inclination of the major axis i?" mx of the half- 
light isophote. The red circle is the median (5%) of the maximum variation. 



43% (0.24 dex). The opposite is true for i?™ ax , but the variations are 
dramatically smaller, with a median increase of 5% (0.02 dex). The 
two effective radii of course are the same for intrinsically spherical 
objects. The use of 7?™ ax instead of R c is especially useful when 
one considers that 86% of the galaxies in ATLAS 3D (and in the 
nearby Universe) are disk like (Paper II, III and VII). 

In what follows we also need the radius ri/2 of a sphere en- 
closing half of the galaxy light. For this we need to derive the in- 
trinsic galaxy luminosity density from the MGE, assuming the best 



with hj = r l(\f2~Oj ) and erf the error function. And the half- 
light spherical radius rv/ 2 is obtained by solving L(r) = L/2 by 
interpolation. As in the projected case, when all Gaussians have 
the same qj — q, which means the density is stratified on similar 
oblate spheroids, the method gives the geometric radius r 1 / 2 = 
(abc) 1/3 = aq 1/3 , where a is the semi-major axis of the spheroid. 
While when the qj are different, this radius provides a very good 
approximation to the radius r lf >2 = [3V e /(47r)] 1 ^ 3 of a sphere that 
has the same volume V e of the iso-surface enclosing half of the total 
galaxy light. 

In Fig. [4] we compare the three definitions of R a as a func- 
tion of the observed effective ellipticity e e of the MGE, for all the 
galaxies in the ATLAS 3D sample. Even though the galaxy isophotes 
are in most cases not well approximated by ellipses, and the galax- 
ies are intrinsically not oblate spheroids, the ratio between R c and 
7?™ ax follows the relation for elliptical isophotes. When the galax- 
ies are very close to circular on the sky R c and i?™ ax agree by defi- 
nition. The situation is very different regarding the relation between 
ri/2 and _R™ ax . In this case, when the galaxy is edge-on, there is a 
simple ratio ri/^/Re ~ 1.42, but when the galaxies have lower 
inclinations, large variations in the ratio are possible, so that ri/2 
cannot be inferred from the observations, without the knowledge 
of the galaxy inclination, which generally require dynamical mod- 
els. The situation is of course much simpler for spherical objects, in 
which case rx / 2 / R e ~ \A2 as in the edge-on case. For comparison 
iHernquisl |l990) found the theoretical value r 1 / 2 /Re ~ 1-33 for 
his spherical models, while lCiottH i 1 99 ll) has shown that for a R}l m 
model the ratio is confined between 1.34—1.3 6, when m = 2 — 10, 
and the same applies to other simple profiles dWolf et ai]|201fj|) . As 
expected our ratio is slightly larger, given that our models, like real 
galaxies, do not extend to infinite radii. For flatter models the cylin- 
drical and spherical circularized radii are approximately related as 
7?c/^?™ ax = x/eT, which one would expect for elliptical isophotes 
while the ratio ri/2/Re remains approximately constant. 
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3.3.2 Comparing effective and gravitational radius 

For an isolated spheric al system in steady state o ne obtains from the 
scalar virial theorem (Binney & Tremaine 2008) 



Spherical half-light and Gravitational radii 



M 



G 



(16) 



where r g is defined as the gravitational radius, which depends on 
the total and luminous mass distribution, M is the galaxy total lu- 
minous plus dark mass and {v 2 ) !X> is the mean-square speed of the 
galaxy stars, integrated over the full extent of the galaxy. In the 



spherical case (w 2 )oo = 3(cr 1 2 os ) 00 



and 



M : 



)(°fos)c 

G 



(17) 



This formula is rigorously independent of anisotropy and only 
depends on the radial p rofiles of luminous and dark matter 
teinnev & Tremainell2008l section 4.8.3). 

When the spherical system is self-consistent i_L(r) oc M(r)) 
the gravitational radius can be easily calculated as 



2L 2 



f °°[L(r)/r]Mr 



(18) 



Here we evaluate this expression using a single numerical quadra- 
ture via equation J 1 5b . from the same spherical deprojected MGE 
we used in the previous Section to calculate r 1( / 2 . The MGE is 
obtained by deprojecting the observed surface brightness at the 
JAM inclination and subsequently making the MGE spherical while 
keeping the same peak stellar density and luminosity of every Gaus- 
sian. In this way our calculation of r g is rigorously accurate when 
the MGE is already spherical, while the formula provides a good 
approximation for flattened galaxies. 

In Fig.[6]we plot the ratio r-i/2/r g , for the full ATLAS 3D sam- 
ple as a function of th e non-paramet r ic Th ird Galaxy Concentra- 
tion (TGC) defined in iTruiillo et all d200ll) as the ratio between 
the light L(R C ) = L/2 enclosed within an isophote of radius 
R c an d the one L(R e /3) e nclosed within an isophote with radius 
i? e /3. iGraham et ai] d200lh have shown that this choice leads to a 
more robust me asure of concentration than popular alternatives (e.g. 
iDoi et al.|[l993l) . We compute the TGC from the circularized MGE 
using equation i ll It . as done for R c . We find a trend in the ratio for 
the galaxies in our sample that varies between r 1 / 2 /r g ~ 0.3 — 0.4 
for the range of galaxy concentrations we observed. For comparison 
we also calculate the TGC and the corresponding r g for spherical 
models described by the R}^ m profile dSersidl968h . This was done 
by constructing analytic profiles, truncating them to R < 4i? c , to 
mimic the depth of the SDSS photometry, befor e fitting them wit h 
the one-dimensional MGE-fitting procedure of Capp ellaril d2002l) . 
Both TGC and r 1 / 2 /r g span the ranges predicted for profiles with 
m — 2 — 6. Our trend in the ratio is more significant than the gen- 
erally a s sumed near constancy around 0.40 ± 0.02, first reported by 
Spitzer] d 19691) for different polytropes, which agrees wi th the the- 
oretica l value ri/ 2 /r = (1 + y/2)/6 m 0.402 for a lHernquistl 
dl99d) profile dMamonl I2OO0I ; iLokas & Mamonl l200ll) . However, 
the variation is indeed rather small, being only at the ±15% level 
around a median value of 0.35 in our sample. 

The relatively small variations of the ratio between the gravi- 
tational and intrinsic T\i% or projected R c half-light radii, explain 
the usefulness of the latter two parameters in measuring dynamical 
scaling relations of galaxies. This fact, combined with the rigorous 
independence on anisotropy, also explains the robustness of a mass 
estimator like 
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Figure 6. The black filled circles mark the ratio j-j/2 /V a between the radius 
of the half-light sphere and the gravitational radius for all the galaxies in 
the sample. For comparison the solid red line indicates the same ratio for a 
spherical galaxy with an R 1 '™ surface brightness profile. From left to right 
the red diamonds mark the locations m = 1,2, 4, 6, 8, 10 respectiv ely. The 
green dashed horizontal line indicates the theoretical value for a lHemquistl 
fl990h profile. 



M1/2 = k 



ri/2Wos)c 



(19) 



when the stellar systems can be assumed to be spherical and kine- 
matics is available over the entire extent of the system, as pointed 
out by Wol f et al.l d2010l) . Assuming the measured ratio r 1 / 2 /r g ~ 
0.4 for galaxies with the approximate concentration of an R 1 ' 4 pro- 
file, already in the self-consistent limit the expected coefficient is 
k w 3/0.4/2 = 3.75, which is close, but 2 5% larger than th e 
corresponding coefficient k = 3 proposed by Wolf et alj d201fj|) . 
However, the ratio ri/2/rg we empirically measured on real galax- 
ies, does not assume the outermost galaxy profiles are known and 
can be extrapolated to infinity, so it weakly depends on the depth of 
the photometry. For example, for a spherical galaxy that follows the 
R 1 ' 4 profile to infinity, we obtain r 1 / 2 /r g = 0.456, which would 
imply k = 3.2 9 in the se l f-cons istent limit. The remaining 10% 
difference from I Wolf etafl d20ld) is easily explained by the small 
increase of (of os )oo due to the inclusion of a dark halo. 



3.3.3 Determination ofa e 

Unfortunately the quantities (U\ os )oo, or (of os )oo are currently 
only observable via discrete t racers in objects li ke nearby dwarf 
spheroidal (dSph) galaxies (e.g JWalker et al . 2007), but it is still not 
a directly observable q uantity in early-type galaxies. Nonetheless 
ICappellari et al. I d2006t) showed that in practice {vf os ) e , as approxi- 
mated by a e , which can be empirically measured for large samples 
of galaxies, can still be used to derive robust central masses when 
applied to real, non-spherical ETGs, with kinematics extended to 
about lR a : 



RoQ- 2 
GL 



(M/L)[r = JJc]^5.0x^, (20) 

where (M/L) [r — R c ] is estimated inside an iso-surface of vol- 
ume V = 47r7? e 3 /3 (a sphere of radius R c if the galaxy is spher- 
ical), and o~ e is the velocity dispersion calculated within a pro- 
jected circular aperture of radius 7? e . In this paper we improve 
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on the previous approach by measuring a e inside an effective el- 
lipse instead of a circle. The ellipse has area A — -kR\ and ellip- 
ticity e e . The measurement is done by co-adding the luminosity- 
weighted spectra inside the elliptical apert ure and measuring the 
a of that effective spectrum using pPXF dCappellari & Emselleml 
2004). Due to the co-addition, the resulting spectrum has ex- 
tremely high S/N (often above 300) and this makes the mea- 
surement robust and accurate. When the SAURON data do not 
fully cover R e we co rrect the a e to IR C using equation (1) of 
Capp ellari et all d2006h . a E has the big advantage over (vf os ) e that 
it can also be much more easily measured at high redshift, as 
it does not require spatially resolved kinematics. Integrated stel- 
lar velocity disp ersions have started to be c ome measurable up to 
redshift z sa 2 dCenarro & Truiilldl2009t ICappellari et alj|2009l: 
van Dokkum et al J 120091 ; lOnodera et al. 201Ct van de Sande et alj 



201 lh . Moreover the advantage of cr e over the traditional central 
dispersion cr c , is that it is empirically closer to the true second ve- 
locity moment (vf OB )oa that appears in the virial equation l |17| l and 
is directly proportional to mass. Making the good approximation 
(M/L)[r = Re] x (M/L)[r = r 1/2 ], where r 1/2 « 1.33.R C , one 
can rewrite equation ((20} in a form that is directly comparable to 
equation l |19t 
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r l/2< 



(21) 



G G 
Note that the empirical coefficient 1.9 is significantly smaller than 
the value around 3.0 one predicts when using {(r? OB )aa in equa- 
tion ([19} and we will come back to this point in Section |4~3l 
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Figure 7. Testing the relative accuracy of size measurements. Comparison 
between the R c from 2MASS plus RC3, matched to RC3 as described in Pa- 
per I, and the R c from the MGEs. For a good match the MGE values have 
been increased by a significant factor 1.35. In what follows the effective 
radii will always already include this multiplicative factor. The coefficients 
of the best-fitting relation y = a + b(x — xo) and the corresponding ob- 
served scatter A in y are shown at the top left of the plot. The two red 
dashed and dotted lines mark the la bands (enclosing 68% of the values 
for a Gaussian distribution) and 2.6<r (99%) respectively. The outliers auto- 
matically excluded from the fit by the LTS.LINEFIT procedure are shown as 
green diamonds. 



4 RESULTS 

4.1 Uncertainty in the scaling relations parameters 

4.1.1 Errors in L, R c and a 

In the study of galaxy scaling relations formal errors on L, R c and a 
are often adopted, as given in output by the program used for their 
extraction. These errors assume the uncertainties are of statistical 
nature. However, in many realistic situations the systematic errors 
are significant, but difficult to estimate. In this work, the availability 
of a significant sample of objects, with similar quantities measured 
via independent data or methods, allow for a direct comparison of 
quantities. This external comparison permits us to include system- 
atic errors into our adopted errors, instead of just using formal or 
Monte Carlo errors. 

In Paper XXI we compare the total magnitude M r of the MGE 
model, as derived from the SDSS+INT r-band photometry to var- 
ious other sources in the literature. We conclude that our total Mr- 
are accurate at the 10% level, in the relative sense. This is the error 
we adopted in what follows. This accuracy is comparable to other 
state-of-the-art photometric surveys. 

A comparison between the circularized half-light radii R e of 
Paper I and the circularized R c from the r-band MGE is shown in 
Fig. [7] In this case the scatter is of 0.058 dex, which would im- 
ply errors of 10% in the individual R a . This must still be a firm 
upper limit to the errors, given that any relative variations, among 
galaxies, in the colour gradients in r and K s will increase the scat- 
ter. Remarkably in this case our scatter between SDSS r-band and 
2MASS K s bands, for the entir e sample, i s as sm all as the best 
agreement (0.05 dex) reported bv lChen et al. ( 2010), using the very 
same SDSS g-band photometry. We are not aware of other pub- 
lished independent R a determinations from different data that agree 



with such a small scatter, and for such a large sample. The rms scat- 
ter we measure is twice smaller that their comparison in the same 
band between SDSS and ACSVCS. Our scatter is also twice smaller 
than a similar comparison we performed in Paper I between the R c 
of 2MASS and RC3. We interpret the excellent reproducibility of 
our MGE R c values, and the agreement with the values of Paper I, 
to the fact that in both 2MASS and our MGE models the total lu- 
minosities are not computed via a extrapolation of the profile to 
infinity, but truncated to the extent of the data. This result is a re- 
minder of the fact that extrapolation is a dangerous practice, which 
should be avoided whenever possible. We argue that it is difficult to 
derive reproducible results on galaxy sizes, when one assumes the 
outer (unobservable) profile is accurately know. So for this work 
define R c as the radius containing half of the observed light, not 
half of the ill-defined amount of total light we think the galaxy may 
have. Of course even our approach does not solve the problem of 
determining an absolute normalization of R c , and our sizes appear 
well reproducible only in a relative sense. 

However, a very important feature of Fig. [7] is the significant 
offset by a factor 1.35 between the MGE R c and the values of Pa- 
per I, with the MGE values being smaller. In what follows all our 
MGE effective radii will always already include this multiplicative 
factor. The values of Paper I wher e determine d from a combination 
of 2M ASS dSkrutskie et"ai]|200d) and RC3 dde Vaucouleurs et alj 
R c measures. But they were scaled to match on average the 
values of the RC3 catalogue, which were determined using growth 
curves extrapolated to infinity. The RC3 nor malization agree within 
5% with the SAURON determinations in dCappellari et alj 120061 : 



iKuntschner et alj2006l . lFalc6n-Barroso et al.l201 lh . Part of the 1.35 
offset is simply due to the extrapolated light in an r 1//4 profile, 
outside the region where our galaxy extend on the SDSS or INT 
images. But the source of the remaining offset is unclear and con- 
firms the difficulty of determining R c . For comparison in Paper I we 
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showed that the 2MASS and RC3 values correlate well, but have an 
even more significant offset of a factor 1.7! 

Various comparisons of the accuracy of kinematic quantitie s 
have been performed in the literature (e.g. Emsellem et al.|[2004h . 
The general finding is that the measurements of the galaxies veloc- 
ity dispersion can be reproduced at best with an accuracy of « 5%, 
mainly due to uncertainties in the stellar templates and various sys- 
tematic effects that are difficult to control. Here in Fig. [8] we test 
the internal errors of our kinematic determination by comparing a e 
against the velocity dispersion measur ed withing a circular ape rture 
of radius 7? = 0.87 kpc as adopted in lJorgensen et al.l dl995bh . We 
measure an rms scatter of A = 0.027 dex between the two quanti- 
ties, which corresponds to a la error of 4.5% in each value. The two 
values do not measure the same quantity, as the two adopted aper- 
tures and fitted spectra are different, and for this reason both the 
actual velocity dispersion and the stellar population change in the 
two pPXF fits. For this reason the observed scatter provides a firm 
upper limit to the true internal uncertainties in <r e . However, in what 
follows we still assume a conservative error of 5% in a e and uo.87, 
to account for pos sible systemati cs. The same choice w as made e.g. 
in iTremaine et alj d2002l) and ICappellari et al. We further 

compared our <To.87 values againts the literatu re a compilation in 
the HyperLEDA database JPaturel et al]|2003l) . for 207 galaxies in 
common with our sample. A robust fit between the logarithm of the 
two quantities eliminating outliers with LTSXINEFIT gives an ob- 
served rms scatter of 9% (A = 0.037 dex), likely dominated by the 
heterogeneity of the HyperLEDA values, and no significant offset 
(2%) in the overall normaliztion. Apart from placing a very firm 
upper limit to our errors, this provides an external estimate of the 
typical uncertainties in the HyperLEDA values. 



4. 1.2 Errors in mass or M / L 

To obtain an estimate of our m ass and M/L er r ors fo r the full 
sample, we proceed similarly to Cappellari et al. I J2006h . namely 
we compare mass determinations using two significantly different 
modelling approaches. In Section [3.1.2|we described the six mod - 
elling approaches that were presented in Cappel lari et al. I d2012bl) 
and we also use in this paper. For this test we compare the two 
very different set of models: the self-consistent model (A) and the 
models (B) which include a NFW halo with mass as free parame- 
ter. For the model with NFW halo we then compute the (M/L) e = 
[M/L](r — R a ) by numerically integrating the luminous and dark 
matter of the models. The total M/L enclosed within an iso-surface 
of volume V = 4ttR c 3 /3 is defined as follows 



[M/L](r = i? e ) 



L{R C ) x (M/L) 

stars 
L(R e ) 



(22) 



where A/dm is the mass in the dark halo. This quantity is compared 
with the (M/L) jam of the self-consistent model in the top panel 
of Fig. [9] The agreement is excellent, with an rms scatter A = 
0.030 dex, consistent with errors of 5% in each quantity. This value 
is nearl y the same as th e value of 6% we estimated as modelling 
error in Cappellari et al. I d2006l) and confirms the original estimate 
of the random modelling uncertainties. There is no evidence for any 
significant trend or systematic offset. 

Importantly this result clarifies two misconceptions regard- 
ing the use of self-consistent models to measure the M/L inside 
r ~ R e in galaxies. S elf-consistent models, like the one used in 
Cappe llari et alj ( 2006), do not significa ntly underestimate th e to- 
tal M/L as it is sometimes stated (e.g. iDutton et alj 1201 lbl , sec- 
tion 3.7). Even though the model with dark halo has a total galaxy 
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mass typically an order of magnitude larger inside the virial ra- 
dius, and has a dramatically different mass profile at large radii, the 
model still measures an unbiased total M / L within a sphere of ra- 
dius r « R e , corresponding to the projected extent of the kinemat- 
ical data. The robustness in the recovery of the enclosed total mass, 
in the region constrained by the data, even in the pres ence of degen- 
eracie s in the halo profile, was already pointed out by Thom aTet alj 
J2005h and is demonstrated here with a much larger sample. 

Of course the self-consistent (M/L) jam is larger than the 
purely stellar one (M/I/) sta rs if dark matter is present, according 
to the relation 



(M/L) jam w [M/L](r = R c 



(M/L). 



(23) 



1 - J"dM (f = -Re) ' 

where the fraction of dark matter contained within an iso-density 
surface of mean radius R c is defined as 

Aft)M (Rc) 



/dm(?" = -Rc 



(24) 



L(R C ) X (M/L) st ars + Mdm(-Rc) ' 

The difference between (M / L) jam and the stellar M/L inferred 
from population models can then be used to give quantitative con- 
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Figure 9. Accuracy of M/L and mass. Top Panel: Same as in Fig. for 
the comparison between the (M/L) jam of the best-fitting self-consistent 
(total mass follows light) models, and the (M/L) e , integrated within an 
iso-surface of volume V = 47riJ c 3 /3 (for a spherical galaxy a sphere of 
radius r = R c ), including the contribution of both the stellar and the dark 
matter component. Except for some outliers due to inferior data, there is no 
bias between the two determinations, which are consistent with an intrinsic 
scatter of 5% in each quantity. Bottom Panel: Same as the top panel for the 
comparison between the total mass of the self-consistent JAM model and 
twice the mass within the half-light iso-surface, for the model with dark 
matter halo. 



st raints on the dark matter content and the form of the IMF, as done 
in Cappellari et al. I d2006h . Moreover the self-consistent models do 
not imply or require the da rk mass to be neg ligible inside r as R c 
as sometimes stated (e.g. iThomas et all 1201 lb . Even for galaxies 
with inferred /dm ~ 50%, due to inferior data (see later), the 
total (luminous plus dark) M/L within IR C is still accurately re- 
covered by the simple self-consistent models. This makes the self- 
consistent models well suite d to determine unbiased total M/L 
within lR e at high redshift d van der Marel & van Dokku^|2003; 
Ivan der Wei & van der Marell 120081 ; ICappellari et alj|2009h . where 
high-quality integral-field stellar kinematics still cannot be obtained 
and dark matter fractions cannot be extracted. 

Using integral-field data the error in this measure of enclosed 
mass is as small as the one that can be obtained from strong lens- 
ing studies. The important difference between the two techniques 
is that the lensing results measure the total mass inside a projected 
cylinder (or elliptical cylinder), while the stellar kinematics gives 



the total mass inside a spherical (or spheroidal) region. The lens- 
ing mass should be larger than the dynamical one if dark matter is 
present in the galaxy. The difference between these two quantities 
provides a measure of the dark matter content along the LOS and 
can be exploited to get some constraints on the dark matter profiles 
dThomas et alj|201 lllPutton et alfeOl lal) . 

For completeness we also show the accuracy in the recovery 
of the enclosed mass in the bottom panel of Fig. [9] as this is what is 
usually presented. This figure contains essentially the same amount 
of information as Fig. [9] given that the luminosities are identical 
on both axes, so the values still differ just by the M/L. This figure 
explicitly illustrates that with good accuracy 



(M/ i) JAM w 2 X M 1/2 . 



(25) 



The JAM models with dark halo additionally provide an es- 
timate of the dark matter fraction /dm (equation i24\ ) enclosed 
within the region constrained by the data r — R c . For the galax- 
ies where our kinematics does not cover lR e , our /dm will be 
more uncertain. The results is presented, as a function of galaxy 
stellar mass M s t ars in Fig. QJj] for the set of models (B), with a 
NFW halo, with mass as free parameter, and for the set of models 
(E), which have a cosmologically-motivated NFW halo, uniquely 
specified by M s tars- We find a median dark matter fraction for 
the ATLAS 3D sample of /dm = 16% for the full sample and 
/dm = 12% for the best models (B) and 17% with models 
(E). These value are broadly consistent, but on the lower limit, 
with numerous previous stellar dynamics determinatio ns inside 1 R e 
from m uch smaller samples and larger uncertainties: iGerhard et al] 
( feOOll) found /dm = 10 — 40% from sph erical dynamical mod- 
elling of 21 ETGs; ICappellari et alj |2006) inferred /dm ~ 30% 
by comparing dynamics a nd population masses of 25 ETGs, and 
assuming a universal IMF; IThomas et all d2007bL [201 l|) measured 
/dm = 23 ± 17% via axisymmetric dynamical models of 16 ETGs; 
Willi ams et ail d2009h measured a median fraction /dm = 15% 
with JAM models of 15 ETGs, as done here, bu t with more extended 
stellar kinematics to as 2 — 3 Re ; The results o f lTortoraetal1 ( l2009h 
are not directly comparable, as they used spherical galaxy toy mod- 
els and inhomogeneous literature data from various sources, how- 
ever they are interesting because they explored a sample of 335 
ETGs, comparable to ours, and report a typical /dm = 30% by 
comparison with stellar population. 

The quite small /dm that we measure is also consistent with 
the fact that the strong lensing analysis of the about 70 galaxies of 
the SLACS sample iBolton et al] j200ot) finds a logarithmic slopes 
for the total (luminous plus dark matter) density close to isother- 
mal. Subsequent re-analyses of their data all c onfirmed a trend 
Ptot(r) oc r ~ 20 , with an intrins i c scatter of as 0.2 (jKoopmans et al] 
I2006L 120091 : lAuger et alf2010al ; iBarnabe et alj|201il) . In Fig.l2lwe 
derive the same slope and intrinsic scatter for the stellar density 
alone, inside a sphere of radius r = R B . This fact seems to suggest 
that dark matter does not play a significant role in galaxy centres 
and that the measured isothermal density slope is essentially due 
the stellar density distribution. Only a very steep dark matter slope 



close to isothermal pdm ( 



like the average stellar distri- 



bution could allow for significant dark matter fractions, while still 
being consistent with these observations. We are not aware of any 
theoretical or empirical evidence for these very steep dark matter 
cusps in galaxies. 
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Figure 10. Dark matter fraction for ATLAS 3D galaxies. The open circles 
indicate the fraction /dm °f dark matter enclosed within the iso-surface of 
volume V = 47riJ c 3 /3 (in the spherical case within a sphere of radius R c ), 
for the best-fitting JAM models, as a function of the galaxy stellar mass 
Mstars inferred by the models. The black symbols are for the subset of 163 
galaxies with the best models and data, while the red symbols indicate less 
impressive model fits (e.g. due to bars, interactions or low inclination) or 
inferior data. The Top Panel corresponds to the results for model (B), with a 
NFW halo having mass as free parameter. The median is /dm = 16% for 
the full sample and /dm = 12% for the best models. In a number of cases 
the model without dark matter is preferred. The solid green line indicates 
the median for six bins of mass. All significant /dm values seems just spu- 
rious results of inferior data or modelling problems. The Bottom Panel is the 
same as the top one, for the set of models (E) which has a cosmologically- 
motivated NFW halo, uniquely determined by M s tars (see text for details). 
The median /dm = 17%. The blue line is a robust parabolic fit to all the 
data, which has best-fitting parameter written in the figure. The difference 
between these two panels likely illustrates our uncertainty in the individ- 
ual dark matter fractions. The robust result is that dark matter fractions for 
halo slopes as steep as NFW or more shallow is small, with /dm < 29% 
(/dm < 18%) in 90% (68% [la]) of the good models. 



4.2 The classic Fundamental Plane 

Since the discovery of the Fundamental Plane (FP) relation be 
tween luminosity, size, and velocity dispersion, in samples of 
local elliptical galaxies (IF 
iDiorgovski & Davisl \lWm . 
to the determination of the FP parameters either including fainter 



Faber et ail 119871 : iDressler ( 
numerous studies have been devoted 



samples of 
etal.lll987l : 



galax ies dNieto et ak I ll990h . f ast rotating ones dPrugniel & Simienl 
1994), or lenticular galaxies djorgensen et alj fl996h . The depen- 
dency of the FP paramet ers have been investigated as a function 
of the phot ometric band dPahre et al.ll 19981 : IScodeggio et alj|l998h 
or redshift ( Ivan Dokkum & Franxlll996r ). Mor eover galaxy samples 
of more tha 10 4 galaxies have been studie d dBernardi et alj|2003l : 
iGraves et alj|2009l: iHvde & Bernardill2009l) . In this section, before 
presenting our result, we study the consistency of our FP parameters 
with previous studies. 

Nearly all previous studies have used as variables the loga- 
rithm of the effective radius R a , the effective surface brightness E e 
and the (central) velocity dispersion a. One of the reasons for this 
choice comes from the emphasis of the FP for distance determina- 
tions. Both E e and a are distance independent, so that all the dis- 
tance dependence can be collected into the R c coordinate by writing 
the FP as 



log R c = a + b log a + clog E e 



(26) 



In the top panel of Fig. QT] we present the edge-on view of our 
ATLAS 3D FP, obtained with the LTS_PLANEFIT routine, where 
we use as velocity d isper sion <j e (Secti on 13.3.3b as done in 
ICappellari et alj (2006) and lFalcon-Barroso et all ( 1201 ll) . but here 
measure within an elliptical isophote. Our best-fitting parameters 
b = 1.048 ± 0.041 and c = -0.746 ± 0.023 are formally quite 
accurate, but significantly different from what is generally found by 
ot her studies: the media n of the 1 1 determinations listed in table 4 
of iBernardi et alj J2003I) is b — 1.33 and c = —0.82, with an rms 
scatter in the values of at = 0.12 and a c = 0.03. The observed 
scatter we measure A « 0.092 i n log R e is very close to what has 
been found by other studies (e.g. lorg ensen et alj fl996 find 0.084). 

To understand the possible reason of this disagreement we test 
the sensitivity of our estimate to the sample selection and the size 
of the kinematical aperture used for the a determinations. For this 
we measure the velocity dispersion o"o.87 inside a circular apertur e 
with radius R — 0.87 kpc, as done by Jorge nsen et alj (l995b). 
We also select the massive half of our sample by imposing a selec- 
tion o"o. 87 > 130 km s _1 . The resulting FP is shown in the middle 
panel of Fig. QT| and now both the fitted values and the observed 
scatter agree with previous values. For comparison we also show 
in the bottom panel of Fig.[TT]the determination of the FP parame- 
ters, when using a e instead of cto.87, but keeping the same selection 
of the massive half of our ATLAS 3D sample a e > 130 kms" 1 . 
These values are also consistent with the literature. This illustrates 
the importance of sample-selection and a extraction in the deriva- 
tion of FP parameters. The increase of b as a function of the lower a 
cut-off of the selectio n is fully consistent w i th the same finding by 
Gargi uloet alj j2009t) and lrlvde & Bernardil j2009t) and we refer the 
reader to the latter paper for a more complete study of the possible 
biases in the FP parameters due to sample selection. The reason for 
the sensitivity of the FP parameters to the selection, is a result of the 
fact that the FP is not a plane, but a warped surface, as we demon- 
strate in Paper XX by studying the variation of the (M/L) jam on 
the VP. So that the FP parameters depend on the region of the sur- 
fa ce one includes in the fi tting. This was also tentatively suggested 
bv lD'Onofrio et alj J2008I) . 

Having shown that with our sample and method we can de- 
rive results that are consistent and at least as accurate as previous 
determinations, we now proceed to study the Virial Plane, by re- 
placing the traditionally used stellar luminosity with the total dy- 
namical mass. We call it in this way, because we will show it is en- 
tirely and accurately explained by the virial equ i librium con dition. 
A similar study was performed bv lBoltonetal] d200ll2008l) , and 
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Figure 11. Classic fundamental plane. Top Panel: edge-on view of the FP 
for all the ATLAS 3D galaxies. The coefficients of the best-fitting plane 
z = a + bx + cx and the corresponding observed scatter A are shown at 
the top left of the plot. The two dashed lines mark the Icr bands (enclosing 
68% of the values for a Gaussian distribution) and 2.6<r (99%). The outliers 
excluded from the fit by the LTS.PLANEFIT procedure are shown with green 
symbols. The errors are the projection of the observational errors, excluding 
intrinsic scatter. Middle Panel: Same as in the top panel, with <ro.g7 mea- 
sured within a circle of 0.87 kpc. Only galaxies with <ro.87 > 130 km s — 1 
are included. Bottom Panel: Same as in the top panel using cr e , but with 
Only including galaxies with cr e > 130 km s — . 



updated by Auge r et al] ( feOlOal) . using masses derived from strong 
lensing analysis. They call their plane the "Mass Plane". Although 
our studies are closely related, we use a different term to empha- 
size the fact that, while the lensing masses are measured within a 
projected cylinder of radius 7? = Re/2, parallel to the LOS, and 
for this reason they include a possible contribution of dark matter 
at large radii, our dynamical masses are measured within a sphere 
of radius r = R e . Th is aspect was not made sufficiently clear in 
ICappellarietafl l l2006l) and we try to avoid possible confusion here. 

4.3 From the Fundamental Plane to the Virial Plane 

The classic form for the FP is ideal when the FP is used to determine 
distances. However, a different form seems more suited to studies 
where the FP is mainly used as a mass or M/L estimator. For this 
we rewrite the FP as 

Here we normalized the <r e and R c values by the approximate me- 
dian of the values for our sample, to reduce the covariance in the fit- 
ted parameters and the error in a. Using L instead of E e has the ad- 
vantage that it reduced the covariances between the pairs of observ- 
ables (E e , R e ). Here in fact, as opposed to when E e = L/(2-KR e 2 ) 
is used, there is no explicit dependence between the three axes, 
which become independently measured quantities. The new fit to 
the FP is shown in the top panel of Fig. [T2] In agreement with all 
previous authors the fitted parameters are very different from the 
values b — 2 and c = 1 expected in the case of the virial equa- 
tion J20b . The relation shows a negligible increase in the observed 
rms scatter, from A = 0.092 dex (24%) to A = 0.10 (26%). This 
may be due to the smaller covariances between our input measure- 
ments: the new scatter is now a better representation of the true 
scatter in the FP relation. 

In the bottom panel of Fig. [12] we show for comparison the 
relation obtained by replacing the total galaxy luminosity with the 
dynamical mass M1/2, which represent the total luminous plus dark 
matter enclosed within a iso-surface enclosing half of the galaxy 
light. In practice in the plots we show 

A'/jAM =LX (Af/ L) JAM W 2 X Ml/2 « Altars, (28) 

where (M/L)jam is the total (luminous plus dark) dynamical 
M/L obtained using self-consistent JAM models (A), L is the to- 
tal galaxy luminosity and M1/2 is the total mass within a sphere of 
radius r i/ 2 enclosing half of the total galaxy light, w here r 1( / 2 fa 
1.337? e dHernauistl 1990l : ICiottJ 199ll : IWolf et alfeoict Fig.©. The 
correctness of the A/jam ~ 2 x M1/2 approximation is illustrated 
in the bottom panel of Fig. [9] While the 2 x M1/2 ~ Af sta rs ap- 
proximation is due to the relatively small amount of dark matter en- 
closed within r = Ti/2 (Fig. I lot. This is only approximately true, 
but much larger errors are generally made when determining stellar 
masses from stellar population models, due the assumption of a uni- 
ver sal IMF, which was recently shown not to represent r eal galax- 
ies dvan Dokkum & Conroy|2010l ; ICappellari et alj2012bl) . None of 
our conclusions is affected by the last approximation, which only 
serves to allow for comparisons of our results to previous similar 
studies that use stellar mass as parameter. 

Two features are obvious from the plot: (i) There is a dra- 
matic reduction of the observed scatter from A = 0.10 (26%) to 
A = 0.062 (15%). This shows without doubt that a major part of 
the scatter in the FP is due to variations in the M/L, in agreemen t 
with independent results from strong lensing dAuger et al.ll2Qlo3) ; 
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(ii) The b coefficient substantially increase and is now much closer 
to the virial value b — 2, while the c coefficient remanins nearly 
unchanged. This confirms that much of the deviation of the FP 
from the virial predictions is due to a systematic variation in Mj L 
along the FP, not to non-homology, al so in agreement with previ- 
ous dynami cal ( Cappellari et al. 2006) and strong lensing results 
teolton et al.ll2008ljAuger et alj|2010al) . 

The result of this exercise clearly shows that the existence of 
the FP is entirely due to the fact that galaxies can be remarkably 
well approximated by virialized stellar systems, with a relatively 
smooth variation of M/L. T hese facts have b een clearly realized 
since the discovery of the FP l lFaber et al.l 19871) and have been gen- 
erally assumed in most recent studies. The new findings on the tilt of 
the FP agree with a similar study of scaling relations in ETGs using 
accurate dynamical m odels and integral-field kinematics of a sam- 
ple of just 25 galaxies (Cappellari et al. 2006) and with independent 
confirmations using stro ng gravitational lensing dBolton et alj200l 
120081 ; lAuger etalj|2010ah . Galaxy non-homology has a minor ef- 
fect at best, when the determination of galaxy scaling parameters is 
pushed to the maximum accuracy and an attempt is made to remove 
the most important biases. 

The level of accuracy at which the simple virial approximation 
holds is not entirely expected however, given the apparent complex- 
ity of galaxy photometry and kinematics. Of course the dynamical 
models assume equilibrium and rigorously satisfy the virial equa- 
tions. One may think that a tight relation is a necessary feature of 
the approach. This is however not correct. It is true in fact that the 
models satisfy the scalar virial equation 2T + W — by construc- 
tion, where T is the total kinetic energy and W is the total poten- 
tial energy. However, given the complex multi-component nature 
of galaxies, the presence of bars, the importance of projection and 
the fact that the potential energy should include dark matter, it is far 
from obvious that one should be able to define any simple empirical 
measure of projected radius on the galaxy, and a measure of velocity 
dispersion within a limited region, so that the virial equation can be 
written in the simple form M1/2 = k a 2 R/G (designed for spheri- 
cal homologous systems), with fixed exponents and nearly constant 
coefficient for the entire population! 

In Fig. Q/3] we present a direct comparison between the new 
JAM M/L estimates within an iso-surface with volume V = 
47T-R e 3 /3 and the the simple virial estimate of equation J20t from 
Cappe llari et alj l l2006l) . Considering the modelling errors of 5% in 
M/L estimated in this paper, we infer an error of 15% in the virial 
estimates. This shows that, although the virial estimates do not suf- 
fer from strong biases, they provide errors about a factor 3 larger, 
even when using our good data. 

Our finding does not see m to agree with the s mall system- 
atic offsets recently reported by IThomas et all J20T ]]). We suspect 
the disagreement may be an effect of small sample statistics and 
larger errors, given that they studied only 16 objects and did not 
use integral-field data. However, the difference they find may also 
be simply due to a systematic difference in their R c determination, 
with respect to the SAURON ones. Our ne w strong empirica l confir - 
mation of the scaling of the coefficient in lCappellari et a[| §006), 
even in the presence of dark matter, also emphasizes the importance 
of using virial coefficients that are calibrated to the extent of the 
availab le kinematic data. The coefficient k = 3.7 5 given by[Spitzer 
Jl969l> or k = 3 proposed bv lWolf et all feo id) for equation (19) 
should not be used to estimate central masses in early-type galaxies, 
where stellar kinematics out to at most a couple of R c is available 
and the corresponding value k « 1.9 of equation i2U applies. The 
difference of the two coefficients is due to the fact that, while the es- 



timator o flWolf etalj < |2010|) is a theoretical one, designed for spher- 
ical geometry, very extended kinematics, a nd assumes galaxy pro - 
files are known to infinite radii, the one bv lCappellari et al.l |2006) 
is an empirical one, designed for quantitative measures of masses in 
the central regions of ETGs. Both estimators are useful in their own 
range of applicability, but they should not be used interchangeably, 
unless one can tolerate systematic biases of « 60% in the absolute 
mass normalization. 

We stress here the importance and the difficulty of obtaining 
effective radii that are as accurate as the model calibration allows. 
Ultimately the general unreliability and poor reproducibility of ef- 
fective radii determined from photometry of different quality is the 
main limiting factor to a quantitative use of the scalar virial rela- 
tions to measure accurate masses or M/L, when a proper absolute 
normalization is essen tial, like in IMF studies of distant galaxies 
JCappellari et al .1 12009). If different methods or extrapolations, ap- 
plied to different, but high-quality photometric data of local galax- 
ies, can produce revisi ons in R e by as much a s a factor of two 
(Kormendy et al. 2009; see also lChen et ai]|201Cl) . more significant 
biases should be expected when comp aring local and high- redshift 
observations, as already pointed out by Mancini et alj u2010h . When 
biases in R e are present, only dynamical models can still pro- 
vide robust central masses and M/L, due to the near insensitiv- 
ity of the models to the shape of the outer mass and light profiles 
Jvan der Marel & van D okkum 2 0071 : Ivan der Wei & van der Marell 
l2008l ; ICappellari et al.ll2009l) . 
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Figure 13. Accu racy of the simple viria l estimate. Comparison between the 
virial estimate of Cappellari et al. (2006) and the more accurate JAM values. 
The inferred rms errors in the estimation of M/L are 17%. Symbols and 
lines are as in Fig. llll 



Figure 12. From the Fundamental Plane to the Virial Plane. Top Panel: Edge 
on view of the FP. Symbols and lines are as in Fig.QT] Bottom Panel: Edge- 
on view of the MP. Note the decrease in the scatter, when making the sub- 
stitution L — > M, and the variation in the coefficients, starting to approach 
the virial ones b = 2 and c = 1. 
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4.4 The (M/L) - a e relation 

In the previous sections we showed that the existence of the Fun- 
damental Plane can be accurately explained by the virial relation 
combined with a smooth variation of the M/L. Here we study 
the previously reported correlation (M/L) oc a®' 8 (in the I- 
band) between the effective velocity disp ersion and the dynam- 
ical M/L within a sphere of radi us R e JCappellari et alj 120061 : 
Ivan der Marel & van Dokkunj l2007l) . This relation was previously 
found to provide the tightest relation among other parameters of 
scaling relations (dynamical mass, luminosity or size), with an ob- 
served scatter of 18% and an inferred intrinsic one of just ~13%, 
when using integral-field kinematics. 

The (M /L) - cr e relation for the full ATLAS 3D sample is 
shown in the top-left panel of Fig. [14] Our new relation has an ob- 
served scatter of 29%, from which we infer an intrinsic scatter of 
22%, when combining our 5% errors in the models with the dis- 
tance errors for the various subsamples as described in section 2.2 
of Paper I. We adopted as distance errors the median one for each 
given class of determinations reported in Paper I, instead of the in- 
dividual errors, which are not easy to trust in every case, and that are 
likely dominated by systematics. The scatter is significantly larger 
than the previously reported one. The new relation has a formally 
accurate power slope of b = 0.688±0.043, which is a bit shallower 
than the previous one, based on a sample ten times smaller than the 
current one. 

To understand the reason for the differences between our 
(M/L) — a e slope and previous determinations, in the top-right 
panel Fig. [14] we plot the (M/L) — a e relation for the subset of 
78 gala xies with SBF d i stance s fro mlTonrv et alj (l200ll). as done 
in bot h ICappellari et al. I J2006h and Ivan der Marel & van D okkuml 
fc007h . The relation for this subset now steepens and becomes fully 
consistent with the previous dete rminations. The rea son for this is 
likely related to the fact that the iTonrv et al. I feooih subsample is 
biased towards elliptical galaxies, which tend to be the brightest 
in our sample. A change in slope is then expected from the curva- 
ture of the (M/L) — <r e relation, which is not clearly visible in our 
range of a values, but is implied by the deviations from our relation 
when other classes of objects with smaller of larger a are consid- 
ered dZaritskv et aflkOOa [20o3 : iTollerud et alj|201 ll) . A small but 
systematic increase in the slope is indeed visible when we select 
subsamples within different a ranges from our ATLAS 3D sample. 
We conclude that the discrepancy between our newly fitted value 
and the previous works is due to the difference in the sample selec- 
tion. The present sample is not only much large than the one used 
in previous studies, but also volume-limited so it provides a sta- 
tistically representative view of the scaling relations in the nearby 
Universe. 

In the middle-left panel of Fig.[l4]we show the (M/L) — a e 
of the 36 slow rotator ETGs defined in Paper III. We confirm a de- 
tectable offset in the relation, with the slo w rotators having slightly 
larger M/L, as previously reported in Cappellari et al. (2006). 
However, the difference is just at the 9% level. There is also a 
change in the slope, with the slow rotators defining a more shal- 
low relation that the full population. We also confirm the s maller 
scatter in the relation, as reported by iFalcon-Barroso et alj d20 1 lh 
for the colour-cr and FP relations. The slow rotators have an ob- 
served scatter of 22%, and an inferred intrinsic one of 15% in the 
(M/L) — <7 e relation. This is likely due to the fact that significant 
amounts of cold gas and star formation, which affect the M/L but 
not <r, are in fast rotators (Paper IV, McDermid et al. in prepara- 
tion). The relation for the fast rotators (middle-right panel) agrees 



with the global one, as expected from the fact that they dominate 
the ATLAS 3D sample. 

The dependence of the slope and zero point of the (M / L) — a e 
relation on environment effects is shown in the bottom panels of 
Fig. [14] As discussed in Paper VII, most of the environmental dif- 
ferences in the ATLAS 3D sample can be characterized by whether 
a galaxy belongs to the Virgo cluster or not. The left panel shows 
the 58 ATLAS 3D galaxies in Virgo. They follow the same shallow 
relation as the slow rotators, but with the zero point of the global re- 
lation. The observed scatter decreas es to just 14%, in part due to the 
accurate distances from ACSVCS jMei et alj[200l) . However, the 
intrinsic scatter A(M/L) also further decreases t o just 10%. This 
is con sistent with the intrinsic scatter measured bv lCappellari et al. 

(2006) , using a radically different set of models and different dis- 
tance estimates (no ACSVCS), but on a sample that, contrary to the 
ATLAS 3D sample, was dominated by Virgo galaxies. The decrease 
in the scatter mus t be related to the decr ease in the fraction of young 
objects in Virgo (Kun tschner et alj 20 id : McDermid et al. in prepa- 
ration). It again confirms that the scatter of the (M/ L) — a e relation 
is dominated by stellar population effects, as previously demon- 
strated for the FP. The two results are two ways of looking at the 
same thing, given that the (M/L) — cr e relation is the projection 
of the differences between the FP and VP along the cr e axis. For 
completeness we also show in the bottom-right the relation for non- 
Virgo galaxies, which dominate the sample and again are consistent, 
albeit a bit steeper, than the global relation. 

In the top panel of Fig.[T5]we show how the tightness of the 
(M/L) — a e relation can be used to cleanly select galaxies belong- 
ing to the Virgo cluster. Here we selected all ATLAS 3D galaxies 
contained within a cylinder of radius of 7? = 12° centred on the 
Virgo cluster (approximately at the location of the galaxy M87) and 
assigned to all of them the cluster distance of D = 16.5 Mpc from 
iMei et alj {2007). We then used the LTS.LINEFIT routine to fit a 
line. Even in the presence of 20 dramatic outliers out of 79 objects, 
the method is able to robustly converge to a clean relation^ The 
method selects 59 galaxies within the 99% (2.6<r) confidence bands 
from the best-fitting relation. The plot reveals a tight sequence in 
the (M/L) — a e , which corresponds to galaxies in the Virgo clus- 
ter, with an observed scatter of A(M/L) = 0.063 (16%). It is 
reassuring to see that this relation, which uses no individual dis- 
tance information for the galaxies, agrees both in the slope and ze- 
ropoint with the ones for all ATLAS 3D galaxies, even though it has 
smaller scatter. Galaxies above the relation lie in the background of 
Virgo, and their difference in distance modulus from Virgo is 2.5 x 
the difference in log(M/L) from the best-fitting relations. In this 
fit we assume that the distance error are due to the la d epth of the 
Virgo cluster. Adopting the value of o\d = 0.6±0.1 from lMei et alj 

(2007) we derive an intrinsic scatter in M/L of e M / L = 0.039 dex 
(9%). 

When w e select only the galaxies with SBF distances from 
the ACSVCS dMei et alj|2007h (bottom panel of Fig.[B](, we find 
a relation with the same slope, but a decreased observed scatter of 
A(M/L) = 0.047 (11%). For this relatively small, but still statis- 
tically significant sample of 32 galaxies, the inferred intrinsic scat- 
ter in M/L would be a mere 8%! Considering that ETGs appear 
to have very small fractions of dark matter in their central region 
(Fig. |10t, a small scatter in dynamical M/L should be expected 



3 Other robust method like (i) minimizing the absol ute deviation, (ii ) us- 
ing iterated biweight estimates or (iii) M-estimates ("Press et al. 200j sec- 
tion 15.7), failed to provide a sensible solution to this problem. 
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Figur e 14. The (M /L) e — cr a relation. From left to right and from top to bottom the relation is shown (i) for all ATLAS 3D galaxies; (ii) for the subset in 
iTonrv etal (iii) for the subset of slow rotators (from Paper III); (iv) for the subset of fast rotators (from Paper III); (v) for the subset of galaxies in the 

Virgo cluster; (vi) for subset not in the Virgo cluster. In all plots the blue symbols are fast rotators , while red symbols are slow rotators. Green symbols are 
outliers excluded from the fit by LTS.LINEFIT. 
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Figure 15. Scatter in the (M/L) e — cr e relation in the Virgo galaxy cluster. 
Top Panel: All ATLAS 3D galaxies within 12° of the center of the Virgo 
cluster have been assigned a fixed distance of D = 16.5 Mpc. The mea- 
sured M/L naturally defines a clean (M/L) e — a e relation for galaxies 
belonging to the cluster. The scatter in this relation is due to a combination 
of the cluster depth and the intrinsic scatte r in the re l ation. Bottom Panel: 
(M/L) e — <7 e relation for the galaxies in Mei et al. (2007). The accurate 
distances produce a quite significant decrease in the scatter, down to just 
11%, indicating that both the (M / L)jam and the SBF distances are signif- 
icantly more accurate than this value and confirming that the SBF distances 
are able to resolve the spatial structure of Virgo, along the LOS, as claimed. 



fro m the extreme tightn ess of the colour-magnitude relation in clus- 
ters dBower et al. 1 19921) and specifically for the ACSVCS galaxies 



dChen et alJ uOlO) given that colour is a direct tracer of the M/L 
of the stellar population jBell & de Jongll200lh . Our small scatter 
finding confirms the remarkable accuracy of the ACSVCS SBF dis- 
tances and their ability to resolve the cluster structure as claimed. It 
shows that the intrinsic (M/L) — a e relation is extremely tight, but 
its study is limited in our sample by the distance errors. It would 
be valuable to perform a similar analysis as in the top panel of 
Fig. Q3] with integral-field data and accurate models, in a cluster 
like Coma, sufficiently close that good stellar kinematics can be 
obtained, but sufficiently far that errors in the distance can be virtu- 
ally ignored. The smaller intrinsic scatter inferred for this sample, 
with respect to the one in the top panel, suggests that, either they 
ar e not drawn from the same population, or the ACSVCS sample 
in lMei et"al] d2007t) spans a slightly smaller set of distances within 



the Virgo cluster, than the ATLAS 3D Virgo sample. The tightness of 
this correlation also places tight constraints on the possible intrinsic 
scatter on the IMF — a trend that we discuss in Paper XX. 



4.5 Relation between a c and the maximum circular velocity 

Previous studies ( Zaritsky et al ] |2006l 2008; McGaugl TetaLll2010l ; 
lDuttonetal]|2011bl) have tried to unify dynamical scaling rela- 
tions of spiral galaxies and early-type galaxies. For spirals one 
can measure the rotat ion velocity of the gas, which appears in the 
lTullv&Fisheil ( ll977l) relation between galaxy luminosity (or mass) 
and its maximum (asymptotic) circular velocity max(V c irc), typi- 
cally measured from the kinematics of the neutral gas at large radii. 
For early-type g alaxies one can measure the velocity dispersion, 
which enters the iFaber & Jackson! d 19761) and Fundamental Plane 
relations. Unification of the scaling relations is done by convert- 
ing velocity dispersion into the circular velocity Kirc(-Re lax ) at 
the half-light radius or into the maximum one max(Kh-c) adopt- 
ing constant factors. 

Typical conversion facto rs for Kirc(J?™ ax ) us ed in the liter- 
ature range from %/2 to % /3 dCourteau et alj [20071) . For example 
IPadmanabhan et al] d2004l) estimates k <v 1.65. While lSchulz et al] 
20ld) adopts k sa 1.7 and lDutton eTafl d201 lbl) uses k w 1.54. 

Our dataset provides accurate a e for all galaxies, together with 
circular velocities from our dynamical models. This allows for a 
robust empirical calibration of the relation. The correlation between 
o e and V c irc(.R™ ax ) is shown in Fig.[l6]and the best-fitting relation 
has the form 



Kirc(iC aX ) « 1-51 x a e 



(29) 



Considering the variety of photometric profile and galaxy flattening 
in our complete sample of ETGs, it is remarkable that the relation 
has a scatter of just 8%, with a weak dependency on <r e . 

Even slightly tighter is the correlation between cr e and 
max(14i rc ), which has the form 



max(K:ir 



1.76 x cr e 



(30) 



and an observed scatter of 7%. Importantly this relation show es- 
sentially no trend with <j e . It is worth noting that the max(V c j rc ) 
defined here is the peak in the rotation curve within the region 
where we have stellar kinematics, which is generally within IR C . 
This value should not be confused with the asymptotic value of 
the circular velocity at large radii, w hich is gene rally used in the 
iTullv & Fisheil d 19771) relation (but see lDavis et alj201ll). Although 
the th e so-called bulge-halo conspiracy Ivan Albada & Sancisil 
11986) generally ten d to make the two values similar (e.g. see 
Williams et afl l2009l) , this fact has never robustly established for 
a significant sample of ETGs. 

As shown in Fig. [T7] the maximum in the circular velocity 
is almost always reached well inside 17? c , with 85% of the peak 
Vdic happening at a radius within R c /2 and a median radius of just 

Re/5. 



5 SUMMARY 

We construct detailed dynamical models (JAM), based on the Jeans 
equations and allowing for orbital anisotropy, for the volume- 
limited and essentially mass-selected ATLAS 3D sample of early- 
type galaxies. The models fit in detail the two-dimensional galaxy 
images and reproduce in detail the integral-field stellar kinematics 
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Figure 16. Circular velocity V c i rc versus <r e . Top Panel: correlation be- 
tween the circular velocity V c i IC (lR c ) inferred from our models at IR C , 
and a e . Bottom Panel: correlation between the peak circular velocity 
max(V c i rc ) (within 1-R C ) and cr e . 
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obtained with SAURON out to about IR C , the projected half-light ra- 
dius. We derive accurate total mass-to-light ratios (M /L) e and dark 
matter fractions /dm, within a sphere of radius r = R c centred on 
the galaxies. We infer masses A/jam = L X (M/L) e « 2 x M 1 / 2 , 
where M1/2 is the mass within a sphere enclosing half of the galaxy 
light. We also measure stellar (Af/L) s tars- 

We test the accuracy of our mass determinations by running 
models with and without dark matter and we find that the enclosed 
total (M / L) e is a robust quantity, independent of the inclusion of a 
dark-matter halo, with an rms accuracy of 5% and negligible bias. 
In other words, even using simple mass-follow-light models, one 
recovers the total enclosed (M/L) e with good accuracy and small 
bias. We illustrate the tecniques we use to measure radii and global 
kinematical quantities from our data, and to robustly fit linear re- 
lations or planes to the data, even in the presence of outliers and 
significant intrinsic scatter. We stress the difficulty of measuring 
absolutely calibrated effective radii R c , and we argue againt ex- 
trapolation in the profiles, for more reproducible results. System- 
atic offsets in R c determinations are the main limitation for the use 
of the scalar virial relation for mass estimates, and may affect size 
comparisons as a function of redshift. 

We find that the thin two-dimensional subset spanned by 
galaxies in the (A/jam, cr e , i?™ ax ) coordinates system, which we 
call the Virial Plane (VP) has an observed rms scatter of 17%, 
which would imply an intrinsic one of just 4%. The VP satisfies 
the scalar virial relation Mjam = 5.0 x a%R™ ax /G within our 
tight errors. However, this is only true if one pays special atten- 
tion to the methodology employed to determine the galaxy global 
parameters and in particular, (i) one uses as scale radius the major 
axis i? c max of the 'effective' isophote enclosing half of the total 
projected galaxy light (without extrapolating the profile beyond the 
data), and (ii) one measures the velocity dispersion o e (which in- 
cludes rotation and random motions) from a spectrum derived in- 
side that effective isoph ote. This confirms with unprecedented ac - 
curacy previous claims ( Cappella riet~ai]|2006l: iBolton et alj |2008) 
that galaxies accurately satisfy the virial relations and that the exis- 
tence of the FP is entirely explained by virial equilibrium plus some 
intrinsic variations in the total (A//L) e . 

We revisit the (M/L) e — a relation and measure a shallower 
observed slope than previously reported. The difference is due to the 
selection of the sample of galaxies previously used to fit the rela- 
tions. We find that the correlation depends both on galaxy rotation 
and environment, in the sense that both for the subsamples of the 
galaxies in Virgo, or for the subsample of slow rotators, the relation 
is more shallow and has a reduced scatter. In the best case, when 
the most accurate distances are used, the observed scatter drops to 
10% and the intrinsic one is estimated to be a mere 6%. 

We study the correlation between a e and the circular velocity 
from the dynamical models. We find that V c irc(fl™ ax ) ~ 1-51 x a e 
and max(Kirc) ~ 1.76 xcr e . The relations have an observed scatter 
of 7-8% and show very little dependence on a e . 

The accurate global dynamical scaling parameters for the 
ETGs in the ATLAS 3D sample are used in the companion Paper XX 
to explore different projection of the Virial Plane and the variation 
of galaxy physical parameters. 



Figure 17. Histogram for the distribution of the radius R/R c at which the 
maximum circular max(V c i rc ) is reached, as a fraction of the galaxy effec- 
tive radius R c . 
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